Abstract — We study the problem of extracting a prescribed
number of random bits by reading the smallest possible number
of symbols from non-ideal stochastic processes. The related
interval algorithm proposed by Han and Hoshi has asymptotically
optimal performance; however, it assumes that the distribution
of the input stochastic process is known. The motivation for
our work is the fact that, in practice, sources of randomness
have inherent correlations and are affected by measurement
noise, so it is hard to obtain an accurate estimate of the
distribution. This challenge was addressed by the concepts of
seeded and seedless extractors, which can handle general random
sources with unknown distributions. However, known seeded
and seedless extractors provide extraction efficiencies that are
substantially smaller than Shannon's entropy limit. Our main
contribution is the design of extractors that have a variable
input length and a fixed output length, are efficient in the
consumption of symbols from the source, are capable of generating
random bits from general stochastic processes, and approach the
information-theoretic upper bound on efficiency.

Index Terms — Randomness Extraction, Imperfect Stochastic 
Processes, Variable-Length Extractors. 



I. Introduction 

We study the problem of extracting a prescribed number
of random bits by reading the smallest possible number
of symbols from imperfect stochastic processes. For perfect
stochastic processes, including processes with accurately known
distributions and perfect biased coins, this problem has been
well studied. It dates back to von Neumann [9], who considered
the problem of generating random bits from a biased coin
with unknown probability. Recently, in [30], we improved von
Neumann's scheme and introduced an algorithm that generates
'random bit streams' from biased coins, uses bounded space,
and runs in expected linear time. This algorithm can generate
a prescribed number of random bits with asymptotically
optimal efficiency. On the other hand, efficient algorithms
have also been developed for extracting randomness from
any known stochastic process (one whose distribution is given).
In [13], Knuth and Yao presented a simple procedure for
generating sequences with arbitrary probability distributions
from an unbiased coin (the probability of H and of T is 1/2).
In [1], Abrahams considered a biased-coin source whose
distribution is an integer power of a noninteger. Han and Hoshi
[10] studied the general problem and proposed an interval
algorithm that generates a prescribed number of random
bits from any known stochastic process and achieves the
information-theoretic upper bound on efficiency. However, in
practice, sources of randomness have inherent correlations
and are affected by measurement noise; hence, they
are not perfect. Existing algorithms for extracting randomness
from perfect stochastic processes do not work for imperfect
stochastic processes, where uncertainty exists.

To extract randomness from an imperfect stochastic process,
one approach is to apply a seeded or seedless extractor to a
sequence generated by the process that contains a sufficient
amount of randomness. We call this approach a fixed-length
extractor for stochastic processes, since all possible
input sequences have the same fixed length. Efficient
constructions of seeded and seedless extractors have been studied
extensively over the last two decades, and it has been shown that
the number of random bits they extract can approach the source's
min-entropy asymptotically. Although
fixed-length extractors can generate random bits of good
quality from imperfect stochastic processes, their efficiencies
are far from optimal. Here, we define the efficiency
of an extractor for stochastic processes as the asymptotic ratio
between the number of extracted random bits and the entropy
of its input sequence (the entropy of the input sequence is
proportional to the expected input length if the stochastic
process is stationary ergodic). This ratio is upper bounded by 1,
since the process of extracting randomness does not increase
entropy. Based on this definition, the
efficiency of a fixed-length extractor is upper bounded by
the ratio between the min-entropy and the entropy of the
input sequence, which is usually several times smaller than 1.
So fixed-length extractors are not very efficient at extracting
randomness from stochastic processes. The intuition is that,
in order to minimize the expected number of symbols read
from an imperfect stochastic process, the length of the input
sequence should be adaptive rather than fixed.

The concepts of min-entropy and entropy are defined as
follows.

Definition 1. Given a random source $X$ on $\{0,1\}^n$, the
min-entropy of $X$ is defined as

$$H_{\min}(X) = \min_{x \in \{0,1\}^n} \log_2 \frac{1}{P[X = x]}.$$

The entropy of $X$ is defined as

$$H(X) = \sum_{x \in \{0,1\}^n} P[X = x] \log_2 \frac{1}{P[X = x]}.$$



The following example compares entropy
with min-entropy for a simple random variable.

Example 1. Let $X$ be a random variable such that $P[X = 0] = 0.9$
and $P[X = 1] = 0.1$. Then $H_{\min}(X) = 0.152$ and
$H(X) = 0.469$. In this case, the entropy of $X$ is about three
times its min-entropy. □
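
The two quantities in Example 1 can be checked numerically. The following is a minimal Python sketch; the helper names `entropy` and `min_entropy` are illustrative and not part of the paper.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: sum of p * log2(1/p) over outcomes with p > 0."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def min_entropy(probs):
    """Min-entropy in bits: min over outcomes of log2(1/p), i.e. log2(1/max p)."""
    return math.log2(1.0 / max(probs))

probs = [0.9, 0.1]          # distribution of X in Example 1
h = entropy(probs)
h_min = min_entropy(probs)
print(f"H(X) = {h:.3f}, H_min(X) = {h_min:.3f}, ratio = {h / h_min:.2f}")
```

The printed ratio is the factor by which entropy exceeds min-entropy for this variable, which is the gap that fixed-length extractors cannot close.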

In this paper, we focus on the notion and constructions
of variable-length extractors (short for variable-to-fixed-length
extractors), namely, extractors with variable input length and
fixed output length. (Note that the interval algorithm proposed
by Han and Hoshi [10] and the streaming algorithm proposed
by us [30] are special cases of variable-length extractors.) Our
goal is to extract a prescribed number of random bits in the
sense of statistical distance while minimizing the expected
input cost, measured by the entropy of the input sequence
(whose length is variable). To make this precise, we let
$d(\mathcal{R}, \mathcal{M})$ be the difference between two known stochastic
processes $\mathcal{R}$ and $\mathcal{M}$, defined by

$$d(\mathcal{R}, \mathcal{M}) = \limsup_{n \to \infty} \max_{x \in \{0,1\}^n} \frac{\log_2 \frac{P_{\mathcal{R}}(x)}{P_{\mathcal{M}}(x)}}{\log_2 \frac{1}{P_{\mathcal{M}}(x)}},$$

where $P_{\mathcal{R}}(x)$ is the probability of generating $x$ from $\mathcal{R}$ when
the sequence length is $|x|$, and $P_{\mathcal{M}}(x)$ is the probability of
generating $x$ from $\mathcal{M}$ when the sequence length is $|x|$.

A few models of imperfect stochastic processes are
introduced and investigated, including:

• Let $\mathcal{M}$ be a known stochastic process. We consider an
arbitrary stochastic process $\mathcal{R}$ such that $d(\mathcal{R}, \mathcal{M}) \le \beta$
for a constant $\beta$.

• We consider $\mathcal{R}$ as an arbitrary stochastic process such
that $\min_{\mathcal{M} \in G_{s.e.}} d(\mathcal{R}, \mathcal{M}) \le \beta$ for a constant $\beta$, where
$G_{s.e.}$ denotes the set consisting of all stationary ergodic
processes.

Generally, given a real, slightly unpredictable source $\mathcal{R}$, it
is not easy to estimate the exact value of $d(\mathcal{R}, \mathcal{M})$ for a
stochastic process $\mathcal{M}$. But its upper bound, i.e., $\beta$, can be
easily obtained. The parameter $\beta$ describes how unpredictable
the real source $\mathcal{R}$ is, so we call it the uncertainty of $\mathcal{R}$.
We prove that it is impossible to construct an extractor that
achieves efficiency strictly larger than $1 - \beta$ for all possible
sources $\mathcal{R}$ with uncertainty $\beta$. Then we introduce several
constructions of variable-length extractors and show that their
efficiencies can reach $\eta \ge 1 - \beta$; that is, the
constructions are asymptotically optimal. The proposed variable-length
extractors have two benefits: (i) they generalize
algorithms for perfect sources to address general imperfect
sources; and (ii) they bridge the gap between min-entropy and
entropy in efficiency.

The following example compares the performance
of a variable-length extractor and a fixed-length extractor
when extracting randomness from a slightly unpredictable
independent process.

Example 2. Consider an independent process $x_1 x_2 x_3 \ldots$ such
that $P[x_i = 1] \in [0.9, 0.91]$. Then it can be obtained that
$\beta \le 0.0315$. For this source, a variable-length extractor can
generate random bits with efficiency at least $1 - \beta = 0.9685$,
which is very close to the upper bound 1. In comparison,
fixed-length extractors can only reach an efficiency of at most 0.3117.
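
The two numbers in Example 2 can be reproduced with a short Python sketch. It rests on assumptions that are not spelled out in the example: the reference process $\mathcal{M}$ is taken to be a biased coin with $P[1] = q$, the coin is chosen by a grid search over $q$, and the maximum in the definition of $d(\mathcal{R}, \mathcal{M})$ is taken on the all-ones and all-zeros sequences (for independent processes the ratio for any sequence is a weighted mediant of the two per-symbol ratios, so it cannot exceed the larger one). The function names are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a coin with P[1] = p."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def worst_case_d(q, p_lo, p_hi):
    """Worst-case d(R, M) when M is a coin with P[1] = q and each symbol of the
    independent source R has P[1] in [p_lo, p_hi] (assumed attained on
    constant sequences)."""
    ones = math.log2(p_hi / q) / math.log2(1 / q)
    zeros = math.log2((1 - p_lo) / (1 - q)) / math.log2(1 / (1 - q))
    return max(ones, zeros)

p_lo, p_hi = 0.9, 0.91
grid = [p_lo + (p_hi - p_lo) * i / 10000 for i in range(10001)]
beta = min(worst_case_d(q, p_lo, p_hi) for q in grid)

# Fixed-length extractors are limited by min-entropy / entropy per symbol;
# within the allowed range this ratio is largest at P[1] = 0.91.
fixed_bound = math.log2(1 / p_hi) / binary_entropy(p_hi)

print(f"beta ~ {beta:.4f}, 1 - beta ~ {1 - beta:.4f}, fixed-length bound ~ {fixed_bound:.4f}")
```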



The remainder of this paper is organized as follows.
Section II presents background and related results. In Section
III, we demonstrate that one cannot construct a variable-length
extractor with efficiency strictly larger than $1 - \beta$
when the source has uncertainty $\beta$. Then we focus on
seeded constructions of variable-length extractors, namely, we
use a small number of additional truly random bits as the
seed (catalyst). Three different constructions are provided and
analyzed in Sections IV, V, and VI, respectively.
All these constructions have efficiencies lower bounded by
$1 - \beta$, implying their optimality. Finally, we discuss seedless
constructions of variable-length extractors for some types of
random sources in Section VII, followed by the concluding
remarks.

II. Preliminaries 

A. Statistical Distance 

Statistical distance is used in computer science to measure
the difference between two distributions. Let $X$ and $Y$ be
two random sequences with range $\{0,1\}^m$. Then the statistical
distance between $X$ and $Y$ is defined as

$$\|X - Y\| = \max_{T : \{0,1\}^m \to \{0,1\}} \big| P[T(X) = 1] - P[T(Y) = 1] \big|,$$

where the maximum is over Boolean functions $T$. We say that $X$ and $Y$ are $\epsilon$-close if
$\|X - Y\| \le \epsilon$. According to this definition, we can also write

$$\|X - Y\| = \frac{1}{2} \sum_{x \in \{0,1\}^m} \big| P[X = x] - P[Y = x] \big| \le \epsilon,$$

which is equivalent to the former expression.

Let $U_m$ denote the uniform distribution on $\{0,1\}^m$. For a
sequence $Y$ to be able to take the place of truly random
bits in a randomized application, we require $Y$ to be $\epsilon$-close to $U_m$,
where $\epsilon$ is small enough. In this case, the extra error probability
introduced by this replacement is at most $\epsilon$. In this paper, we
want to extract $m$ almost-random bits such that they form a
sequence $\epsilon$-close to the uniform distribution $U_m$ on $\{0,1\}^m$
for a specified small $\epsilon > 0$, i.e.,

$$\|Y - U_m\| \le \epsilon.$$
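
As an illustration of the equivalence between the two expressions for statistical distance, the following Python sketch computes both for a small hand-picked distribution on $\{0,1\}^2$. The distribution `y` and the helper names are assumptions made for the example only.

```python
from itertools import product

def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def best_test_advantage(p, q):
    """Advantage of the best Boolean test, which accepts exactly the
    outcomes where p exceeds q."""
    support = set(p) | set(q)
    return sum(max(p.get(x, 0.0) - q.get(x, 0.0), 0.0) for x in support)

m = 2
uniform = {bits: 1 / 2**m for bits in product((0, 1), repeat=m)}
y = {(0, 0): 0.4, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.1}   # hypothetical Y

dist = statistical_distance(y, uniform)
adv = best_test_advantage(y, uniform)
print(dist, adv)
```

Both quantities agree, so this `y` is $\epsilon$-close to $U_2$ exactly when $\epsilon$ is at least the printed value.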

B. Seeded Extractors 

In 1990, Zuckerman introduced a general model of weak
random sources, called $k$-sources, namely sources whose min-entropy
is at least $k$ [32]. It was shown that given a source on
$\{0,1\}^n$ with min-entropy $k < n$, it is impossible to devise
a single function that extracts even one bit of randomness.
This observation led to the introduction of seeded extractors,
which use a small number of additional truly random bits as
the seed (catalyst). When simulating a probabilistic algorithm,
one can simply eliminate the requirement of truly random bits
by enumerating all possible strings for the seed and taking a
majority vote on the final results. There is a variety of very
efficient constructions of seeded extractors, summarized in
[16], [22]. Mathematically, a seeded extractor is a function

$$E : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m,$$

such that for every distribution $X$ on $\{0,1\}^n$ with $H_{\min}(X) \ge k$,
the distribution $E(X, U_d)$ is $\epsilon$-close to the uniform
distribution $U_m$. Here, $d$ is the seed length, and we call such an
extractor a $(k, \epsilon)$ extractor. Much work has focused
on efficient constructions of seeded extractors. A standard
application of the probabilistic method shows that there
exists a seeded extractor that can extract asymptotically
$H_{\min}(X)$ random bits with $\log(n - H_{\min}(X))$ additional truly
random bits. Recently, Guruswami, Umans and Vadhan [9]
provided an explicit construction of seeded extractors whose
efficiency is very close to the bound obtained from the
probabilistic method. Their main result is described as follows:

Lemma 1. [9] For every constant $\alpha > 0$, all positive
integers $n$, $k$, and all $\epsilon > 0$, there is an explicit construction
of a $(k, \epsilon)$ extractor $E : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with
$d \le \log n + O(\log(k/\epsilon))$ and $m \ge (1 - \alpha)k$.

The above result implies that given any source $X \in \{0,1\}^n$
with min-entropy $k$, if $k \ge (1 + \alpha)m$ with $\alpha > 0$, we can always
construct a seeded extractor that generates a random sequence
$Y \in \{0,1\}^m$ that is $\epsilon$-close to the uniform distribution. In this
case, the seed length $d \le \log n + O(\log(k/\epsilon))$ depends on the
input length $n$ and the parameter $\epsilon$.

C. Seedless Extractors 

In the last decade, the concept of seedless (deterministic)
extractors has attracted renewed interest, motivated by the
reduction of the computational complexity of simulating
probabilistic algorithms as well as some requirements in
cryptography [6]. Several specific classes of sources have been
studied, including independent sources, which can be divided
into several independent parts each containing a certain amount of
randomness; bit-fixing sources, where some
bits in a binary sequence are truly random and the remaining
bits are fixed; and samplable sources, where the
source is generated by a process that has a bounded amount
of computational resources such as space [12], [25]. For example,
suppose that we have multiple independent sources with the
same length $n$. It is known how to extract from two sources
when the min-entropy in each is $> 0.5n$ [20] or slightly less
than $0.5n$ [3], and how to extract from $O(1/\gamma)$ sources if the
min-entropy in each is at least $n^{\gamma}$ [18]. All these constructions have
exponentially small error, and they are able to extract $\Omega(k)$
random bits.

Both seeded extractors and seedless extractors described
above have fixed input length, fixed seed length ($d = 0$
for seedless extractors), and fixed output length. So we call
them fixed-length extractors. To apply a fixed-length extractor
to a stochastic process, one first needs
to read a sequence of fixed length whose min-entropy
is strictly larger than the number of random bits to be
generated. Fixed-length extractors can generate random
bits of good quality from imperfect stochastic processes, but
they usually consume more incoming symbols than
necessary. To increase information efficiency, we let
the length of input sequences be adaptive; hence, we have the
concept of 'variable-length extractors'.



D. Variable-Length Extractors 

A variable-length extractor is an extractor with variable
input length and fixed output length. When a variable-length
extractor is applied to a stochastic process, it reads incoming
symbols one by one until the whole incoming sequence meets
a certain criterion; then it maps the incoming sequence into
a binary sequence of fixed length as the output. Depending
on the source, the construction may require a small number
of additional truly random bits as the seed. Hence, we have
seeded variable-length extractors and seedless variable-length
extractors.

A seeded variable-length extractor is a function

$$V_E : S_p \times \{0,1\}^d \to \{0,1\}^m,$$

such that given a real source $\mathcal{R}$, the output sequence is $\epsilon$-close
to the uniform distribution $U_m$. Here, $S_p$ is the set consisting of
all possible input sequences, called the input set. It is complete
and prefix-free. The input set is complete in the sense that
any infinite sequence has a prefix in the set; so when reading
symbols from any source, we always meet a sequence in
the set. Then we stop reading and map this sequence into a
binary sequence of length $m$. Being prefix-free is not strictly
necessary; it ensures that all the sequences in $S_p$ can actually
be read.

A general procedure for extracting randomness by using
variable-length extractors can be divided into three steps:

1) Determine an input set $S_p$ such that its min-entropy
based on the real source $\mathcal{R}$ is at least $k$, namely,

$$\min_{x \in S_p} \log_2 \frac{1}{P_{\mathcal{R}}(x)} \ge k,$$

where $k \ge (1 + \alpha)m$ for any $\alpha > 0$.

2) Construct an injective function

$$V : S_p \to \{0,1\}^n$$

to map the sequences in $S_p$ into binary sequences of
length $n$. We read symbols from the source $\mathcal{R}$ one by
one until the current incoming sequence matches one in
$S_p$. This incoming sequence is then mapped to a binary
sequence of length $n$ by the function $V$. As a result,
we get a random sequence $Z$ with length $n$ and min-entropy
$k$ (since $V$ is injective).

3) Since $k = (1 + \alpha)m$ with $\alpha > 0$, according to Lemma 1
we can always find a seeded extractor

$$E : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$$

that can extract $m$ almost-random bits from a source
with min-entropy $k$. By applying this seeded extractor
$E$ to the sequence $Z$, we get a random sequence of
length $m$ that is $\epsilon$-close to the uniform distribution $U_m$.
Here, the seed length is $d \le \log n + O(\log(k/\epsilon))$.

We can see that the construction of a variable-length
extractor is a cascade of a function $V$ and a seeded extractor $E$,
i.e.,

$$V_E = E \circ V.$$
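
The three steps can be sketched in Python under several loudly stated assumptions. The known process $\mathcal{M}$ is taken to be a biased coin with $P[1] = q$; the input set is defined by the stopping rule $\log_2 \frac{1}{P_{\mathcal{M}}(x)} \ge k/(1-\beta)$, which gives $P_{\mathcal{R}}(x) \le 2^{-k}$ whenever $P_{\mathcal{R}}(x) \le P_{\mathcal{M}}(x)^{1-\beta}$; and the seeded extractor of Lemma 1 is replaced by Toeplitz hashing, a 2-universal hash family used here only as a stand-in (its seed has $n + m - 1$ bits, far longer than the seed in Lemma 1). All function names and parameter values are hypothetical.

```python
import math
import random

def coin_source(p_one, rng):
    """Stand-in for the real source R: independent bits with P[1] = p_one."""
    while True:
        yield 1 if rng.random() < p_one else 0

def read_input(source, q, k, beta):
    """Step 1: read bits until log2(1/P_M(x)) >= k / (1 - beta)."""
    threshold = k / (1.0 - beta)
    bits, log_inv_p = [], 0.0
    while log_inv_p < threshold:
        b = next(source)
        bits.append(b)
        log_inv_p += math.log2(1.0 / (q if b == 1 else 1.0 - q))
    return bits

def max_input_length(q, k, beta):
    """Every bit adds at least log2(1/max(q, 1-q)), which bounds |x| on S_p."""
    per_bit = math.log2(1.0 / max(q, 1.0 - q))
    return math.ceil(k / (1.0 - beta) / per_bit)

def pad_injective(bits, n):
    """Step 2: V appends a single 1 and then 0s; injective for len(bits) < n."""
    assert len(bits) < n
    return bits + [1] + [0] * (n - len(bits) - 1)

def toeplitz_hash(z, seed, m):
    """Step 3 stand-in: multiply z by an m x n Toeplitz matrix over GF(2)."""
    n = len(z)
    assert len(seed) == n + m - 1
    return [sum(seed[i - j + n - 1] & z[j] for j in range(n)) % 2
            for i in range(m)]

q, beta = 0.8, 0.06          # assumed model M and uncertainty
m, alpha = 4, 1.0            # output length and slack
k = (1 + alpha) * m          # target min-entropy
n = max_input_length(q, k, beta) + 1

rng = random.Random(0)       # demo only; a real seed must be truly random
x = read_input(coin_source(0.81, rng), q, k, beta)
z = pad_injective(x, n)
seed = [rng.randrange(2) for _ in range(n + m - 1)]
y = toeplitz_hash(z, seed, m)
print(len(x), y)
```

The number of symbols consumed varies from run to run, while the output length stays fixed at $m$, which is the defining property of a variable-length extractor.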

Note that our requirement is to extract a sequence of $m$
almost-random bits that is $\epsilon$-close to the uniform distribution
$U_m$. The key to constructing variable-length extractors is to
find an input set $S_p$ with min-entropy $k$, even though the distribution
of the real source $\mathcal{R}$ is slightly unpredictable, such that the
expected length of the sequences in $S_p$ is minimized. For
stationary ergodic processes, minimizing the expected length
is asymptotically equivalent to minimizing the entropy of the
sequences in $S_p$ (this is discussed in Section III).

For some specific types of sources, including independent
sources and samplable sources, by applying the ideas in [19]
and [12] we can remove the requirement of truly random bits
without degrading the asymptotic performance. As a result,
we have seedless variable-length extractors. For example, if
the source $\mathcal{R}$ is an independent process, we can first apply
the method in [19] to extract $d$ almost-random bits from the
first 0(log — ) bits, and then use them as the seed of a seeded 
variable-length extractor to extract randomness from the rest of
the process. The detailed discussions will be given in Section VII.

III. Efficiency and Uncertainty 
A. Efficiency 

To consider the performance of an extractor, we define its
efficiency as the asymptotic ratio between the output length
and the total entropy of all its inputs. So the efficiency of an
extractor can be written as

$$\eta = \lim_{m \to \infty} \frac{m}{H_{\mathcal{R}}(X_m) + d},$$

such that the output sequence is $\epsilon$-close to the uniform
distribution $U_m$ on $\{0,1\}^m$, where $\epsilon$ is small, $d$ is the seed
length, $m$ is the output length, and $H_{\mathcal{R}}(X_m)$ is the entropy of
the input sequence $X_m$ with range $S_p$. In our constructions,
$d \le \log n + O(\log(m/\epsilon))$, which is negligible compared to
$H_{\mathcal{R}}(X_m)$ as $m \to \infty$. Hence, we can write

$$\eta = \lim_{m \to \infty} \frac{m}{H_{\mathcal{R}}(X_m)}.$$

In the definition, we use the entropy of the input sequence
rather than the expected input length, because the source that
we consider may not be stationary ergodic. It is worth
mentioning that, in seeded constructions, the value of $d$ is also
an important parameter, although it is much smaller than $m$.
The problem of minimizing the seed length $d$ can be studied
separately from minimizing the entropy of the input sequence,
and it will be addressed in this paper.

First, we demonstrate that if a distribution is $\epsilon$-close to the
uniform distribution $U_m$, then the entropy of this distribution
is asymptotically $m$ for any $\epsilon < 1$.

Lemma 2. Let $X$ be a random sequence on $\{0,1\}^m$ that is
$\epsilon$-close to the uniform distribution $U_m$. Then

$$m - \log_2 \frac{1}{1 - \epsilon} \le H(X) \le m.$$



Proof: Since there are $2^m$ possible assignments
for $X$, it is easy to get $H(X) \le m$. So we only need to prove
that

$$H(X) \ge m - \log_2 \frac{1}{1 - \epsilon}.$$

Let $p(x)$ denote $P[X = x]$ for $x \in \{0,1\}^m$. Since $X$ is
$\epsilon$-close to the uniform distribution $U_m$, we have

$$\frac{1}{2} \sum_{x \in \{0,1\}^m} |p(x) - 2^{-m}| \le \epsilon.$$

Then the lower bound of $H(X)$ can be written as

$$\min \sum_{x \in \{0,1\}^m} p(x) \log_2 \frac{1}{p(x)}$$

subject to

$$p(x) \ge 0, \ \forall x \in \{0,1\}^m;$$

$$\sum_{x \in \{0,1\}^m} p(x) = 1;$$

$$\sum_{x \in \{0,1\}^m} |p(x) - 2^{-m}| \le 2\epsilon.$$

Obviously, the optimal solution of the above problem
happens at

$$\sum_{x \in \{0,1\}^m} |p(x) - 2^{-m}| = 2\epsilon.$$

To solve the problem based on Lagrange multipliers, we let

$$\Lambda(p) = \sum_{x \in \{0,1\}^m} p(x) \log_2 \frac{1}{p(x)} + \lambda_1 \Big( \sum_{x \in \{0,1\}^m} p(x) - 1 \Big) + \lambda_2 \Big( \sum_{x \in \{0,1\}^m} |p(x) - 2^{-m}| - 2\epsilon \Big).$$

If $p(x) > 0$ with $x \in \{0,1\}^m$ is a solution of the above
problem, then

$$\frac{\partial \Lambda}{\partial (p(x))} = 0,$$

i.e.,

$$-\frac{\ln p(x) + 1}{\ln 2} + \lambda_1 + \lambda_2 = 0 \quad \text{if } 2^{-m} < p(x) \le 1,$$

$$-\frac{\ln p(x) + 1}{\ln 2} + \lambda_1 - \lambda_2 = 0 \quad \text{if } 0 < p(x) < 2^{-m}.$$

So there exist two constants $a$ and $b$ with $0 \le a \le 2^{-m} \le b \le 1$
such that

$$p(x) = a \quad \text{if } 0 < p(x) < 2^{-m},$$

$$p(x) = b \quad \text{if } 2^{-m} < p(x) \le 1.$$

Assume that there are $t$ assignments of $x$ with $p(x) = a$;
then there are $2^m - t$ assignments of $x$ with $p(x) = b$. Hence,
the problem is converted to one over $a, b, t$, i.e.,

$$\min_{a,b,t} \ t a \log_2 \frac{1}{a} + (2^m - t) b \log_2 \frac{1}{b}$$

subject to

$$0 \le t \le 2^m;$$

$$t a + (2^m - t) b = 1; \qquad (1)$$

$$t(2^{-m} - a) + (2^m - t)(b - 2^{-m}) = 2\epsilon. \qquad (2)$$

From Equ. (1) and (2), we get

$$a = 2^{-m} - \frac{\epsilon}{t}, \qquad b = 2^{-m} + \frac{\epsilon}{2^m - t}.$$

So the question is finding the optimal $t$ that minimizes

$$-t \Big( 2^{-m} - \frac{\epsilon}{t} \Big) \log_2 \Big( 2^{-m} - \frac{\epsilon}{t} \Big) - (2^m - t) \Big( 2^{-m} + \frac{\epsilon}{2^m - t} \Big) \log_2 \Big( 2^{-m} + \frac{\epsilon}{2^m - t} \Big)$$

subject to

$$\epsilon 2^m \le t \le 2^m.$$

The optimal solution is $t^* = \epsilon 2^m$. In this case, the entropy
of $X$ is

$$H(X) = \log_2 (2^m - t^*) = m - \log_2 \frac{1}{1 - \epsilon},$$

which is the lower bound.

This completes the proof. ■
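
The distribution singled out at the end of the proof (probability $a = 0$ on $t^* = \epsilon 2^m$ strings and equal probability $b$ on the rest) can be evaluated directly. The Python sketch below only checks this one distribution for illustrative values of $m$ and $\epsilon$; the variable names are assumptions of the example.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

m, eps = 10, 0.25                 # illustrative parameters
size = 2**m
t = int(eps * size)               # t* = eps * 2^m strings receive probability 0
b = 1.0 / (size - t)              # the remaining strings share the mass equally
probs = [0.0] * t + [b] * (size - t)

distance = 0.5 * sum(abs(p - 1.0 / size) for p in probs)
h = entropy(probs)
bound = m - math.log2(1.0 / (1.0 - eps))
print(distance, h, bound)
```

For these parameters the statistical distance to $U_m$ equals $\epsilon$ and the entropy coincides with $m - \log_2 \frac{1}{1-\epsilon}$, matching the closed form in the proof.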

In the following lemma, we show that the efficiency of any
extractor is upper bounded by 1. The reason is that the
amount of information, i.e., entropy, does not increase during
the process of randomness extraction.

Lemma 3. For any extractor with seed length $d$ and output
length $m$, if $d = o(m)$, its efficiency satisfies $\eta \le 1$.

Proof: We consider fixed-length extractors as a special
case of variable-length extractors, and seedless
extractors as a special case of seeded extractors with $d = 0$. So
our proof only focuses on seeded variable-length extractors.

A main observation is that for any extractor, the entropy
of its output sequence is bounded by the entropy of the input
sequence plus the entropy of the seed, since the process of
extracting randomness cannot create new randomness.

For the output sequence, denoted by $Y$, it is $\epsilon$-close to the
uniform distribution $U_m$. According to Lemma 2, its entropy
is

$$H_{\mathcal{R}}(Y) \ge m - \log_2 \frac{1}{1 - \epsilon}.$$

The total entropy of the inputs is $H_{\mathcal{R}}(X_m) + d$. Hence,

$$H_{\mathcal{R}}(Y) \le H_{\mathcal{R}}(X_m) + d.$$

As a result, the efficiency of the extractor is

$$\eta = \lim_{m \to \infty} \frac{m}{H_{\mathcal{R}}(X_m)} = \lim_{m \to \infty} \frac{H_{\mathcal{R}}(Y)}{H_{\mathcal{R}}(X_m) + d} \le 1.$$

This completes the proof.

If $\mathcal{R}$ is a stationary ergodic process, we define its entropy
rate as

$$h(\mathcal{R}) = \lim_{l \to \infty} \frac{H(X^l)}{l},$$

where $X^l$ is a random sequence of length $l$ generated from
the source $\mathcal{R}$. In this case, the entropy of the input sequence
on $S_p$ is proportional to the expected input length.

Lemma 4. Given a stationary ergodic source $\mathcal{R}$, let $X_m$ be
the input sequence of a variable-length extractor that has
output length $m$. Then

$$\lim_{m \to \infty} \frac{H_{\mathcal{R}}(X_m)}{h(\mathcal{R}) \, E_{\mathcal{R}}[|X_m|]} = 1,$$

where $E_{\mathcal{R}}[|X_m|]$ is the expected input length.



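Proof: $X_m$ is a random sequence from $S_p$ based on
the distribution of $\mathcal{R}$. Let $l_1$ be the minimum length of the
sequences in $S_p$; as $m \to \infty$, $l_1 \to \infty$. Now, we define

$$l_i = l_1 + (i - 1) \log l_1 \quad \text{for all } i > 1.$$

Based on them, we divide all the sequences in $S_p$ into subsets

$$S_i = \{ x \mid x \in S_p, \ l_i \le |x| \le l_{i+1} - 1 \}$$

for $i \ge 1$.

Let $p_i = P_{\mathcal{R}}(X_m \in S_i)$. Then

$$H_{\mathcal{R}}(X_m) \ge \sum_i \Big[ \Big( \sum_{j \ge i} p_j \Big) H_{\mathcal{R}}\big( X_{l_{i-1}+1}^{l_i} \,\big|\, X_1^{l_{i-1}}, |X_m| \ge l_i \big) \Big],$$

where $l_0 = 0$, $\sum_{j \ge i} p_j$ is the probability that $|X_m| \ge l_i$,
and $X_a^b$ is the sequence of $X_m$ from the $a$th element to the $b$th
element.

Since $X_m$ is generated from a stationary ergodic process,
and $l_1 \to \infty$ as $m \to \infty$, we can get

$$H_{\mathcal{R}}\big( X_{l_{i-1}+1}^{l_i} \,\big|\, X_1^{l_{i-1}}, |X_m| \ge l_i \big) \to (l_i - l_{i-1}) h(\mathcal{R}).$$

As a result, as $l_1 \to \infty$, we have

$$H_{\mathcal{R}}(X_m) \ge (1 - \epsilon) \sum_i \Big( \sum_{j \ge i} p_j \Big) (l_i - l_{i-1}) h(\mathcal{R}) = (1 - \epsilon) \sum_i p_i l_i h(\mathcal{R}),$$

for an arbitrary $\epsilon > 0$.

Also considering the other direction, we can get that as
$l_1 \to \infty$,

$$H_{\mathcal{R}}(X_m) \le (1 + \epsilon) \sum_i p_i l_{i+1} h(\mathcal{R}) = (1 + \epsilon) \sum_i p_i (l_i + \log l_1) h(\mathcal{R}),$$

for an arbitrary $\epsilon > 0$.

For the expected input length, i.e., $E_{\mathcal{R}}[|X_m|]$, it is easy to
show that

$$\sum_i p_i l_i \le E_{\mathcal{R}}[|X_m|] \le \sum_i p_i l_{i+1} = \sum_i p_i (l_i + \log l_1).$$

So as $m \to \infty$, i.e., $l_1 \to \infty$, it yields

$$\lim_{m \to \infty} \frac{H_{\mathcal{R}}(X_m)}{E_{\mathcal{R}}[|X_m|]} = \lim_{m \to \infty} \frac{\sum_i p_i l_i h(\mathcal{R})}{\sum_i p_i l_i} = h(\mathcal{R}).$$

This completes the proof. ■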
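
For an independent and identically distributed binary source, which is a special case of a stationary ergodic process, the proportionality in Lemma 4 holds exactly for any finite, complete, prefix-free input set: the entropy of the stopped sequence equals the expected length times the per-symbol entropy. The Python sketch below checks this on a small hand-picked input set; the set `S_p`, the bias `p_one`, and the helper names are assumptions of the example.

```python
import math
from fractions import Fraction

def is_prefix_free(words):
    """No word in the set is a proper prefix of another."""
    return not any(a != b and b.startswith(a) for a in words for b in words)

def is_complete(words):
    """For a finite prefix-free binary set, Kraft sum 1 means every infinite
    sequence has a prefix in the set."""
    return sum(Fraction(1, 2 ** len(w)) for w in words) == 1

def prob(word, p_one):
    """Probability of reading `word` from an i.i.d. source with P[1] = p_one."""
    return math.prod(p_one if c == "1" else 1 - p_one for c in word)

S_p = ["1", "01", "001", "000"]      # hypothetical input set
p_one = 0.3

probs = [prob(w, p_one) for w in S_p]
H_input = sum(p * math.log2(1 / p) for p in probs)
expected_len = sum(p * len(w) for p, w in zip(probs, S_p))
h_rate = -p_one * math.log2(p_one) - (1 - p_one) * math.log2(1 - p_one)

print(H_input, expected_len * h_rate)
```

The two printed values coincide, so for this source minimizing the expected input length and minimizing the input entropy are the same problem.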



B. Sources and Uncertainty 

Given a source $\mathcal{R}$, if its distribution is known, we say that
this source is a known stochastic process, and its uncertainty is
0. In this paper, we mainly focus on imperfect processes
whose distributions are slightly unpredictable due to
factors such as the existence of external adversaries.

First, given two known stochastic processes $\mathcal{R}$ and $\mathcal{M}$, we
let $d(\mathcal{R}, \mathcal{M})$ be the difference between $\mathcal{R}$ and $\mathcal{M}$. Here, we
define $d(\mathcal{R}, \mathcal{M})$ as

$$d(\mathcal{R}, \mathcal{M}) = \limsup_{n \to \infty} \max_{x \in \{0,1\}^n} \frac{\log_2 \frac{P_{\mathcal{R}}(x)}{P_{\mathcal{M}}(x)}}{\log_2 \frac{1}{P_{\mathcal{M}}(x)}},$$

where $P_{\mathcal{R}}(x)$ is the probability of generating $x$ from $\mathcal{R}$ when
the sequence length is $|x|$, and $P_{\mathcal{M}}(x)$ is the probability
of generating $x$ from $\mathcal{M}$ when the sequence length is $|x|$.
Although there are existing ways, such as the normalized
Kullback-Leibler divergence, to measure the difference
between two sources, with them it is not easy to estimate the
uncertainty of a source or to analyze the
performance of the constructed variable-length extractors.

In the rest of this paper, we investigate a few models
of unpredictable sources. Most natural sources can be well
described in these ways.

1) The source $\mathcal{R}$ is an arbitrary stochastic process such that
$d(\mathcal{R}, \mathcal{M}) \le \beta$ for a constant $\beta \in [0, 1]$ and a known
stochastic process $\mathcal{M}$.

2) $\mathcal{R}$ is an arbitrary stochastic process such that there
exists a stationary ergodic process $\mathcal{M}$ (whose
distribution is unknown) with $d(\mathcal{R}, \mathcal{M}) \le \beta$; that is,
$\min_{\mathcal{M} \in G_{s.e.}} d(\mathcal{R}, \mathcal{M}) \le \beta$, where $G_{s.e.}$ denotes the set
consisting of all stationary ergodic processes.

In both models, we call $\beta$ the uncertainty of the
source $\mathcal{R}$. In the real world, $\beta$ can be easily estimated without
knowing the distribution of the processes. It simply reflects how
unpredictable the real source $\mathcal{R}$ is.

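To construct variable-length extractors, we only care about
the possible input sequences, namely, those in $S_p$. Hence,
for the case of finite length, $d_p(\mathcal{R}, \mathcal{M})$ is a more important
parameter for us, defined by

$$d_p(\mathcal{R}, \mathcal{M}) = \max_{x \in S_p} \frac{\log_2 \frac{P_{\mathcal{R}}(x)}{P_{\mathcal{M}}(x)}}{\log_2 \frac{1}{P_{\mathcal{M}}(x)}}.$$

As the number of required random bits $m$ increases,
$d_p(\mathcal{R}, \mathcal{M})$ quickly converges to $d(\mathcal{R}, \mathcal{M})$, and we can write

$$d_p(\mathcal{R}, \mathcal{M}) = d(\mathcal{R}, \mathcal{M}) + \epsilon_p$$

for a very small constant $\epsilon_p$. As $m \to \infty$, $\epsilon_p \to 0$. In this
case, the upper bound of $d_p(\mathcal{R}, \mathcal{M})$ or $\min_{\mathcal{M} \in G_{s.e.}} d_p(\mathcal{R}, \mathcal{M})$
is

$$\beta_p = \beta + \epsilon_p.$$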

Example 3. Let $x_1 x_2 \ldots \in \{0,1\}^*$ be a sequence generated
from an independent source $\mathcal{R}$ such that

$$\forall i \ge 1, \quad P[x_i = 1] \in [0.8, 0.82].$$

If we let $\mathcal{M}$ be a biased coin with probability 0.8132, then

$$\max_{\text{possible } \mathcal{R}} d(\mathcal{R}, \mathcal{M}) = \max \left( \frac{\log_2 \frac{0.2}{0.1868}}{\log_2 \frac{1}{0.1868}}, \ \frac{\log_2 \frac{0.82}{0.8132}}{\log_2 \frac{1}{0.8132}} \right) = 0.0405.$$



According to our definition, $d(\mathcal{R}, \mathcal{M}) \le \beta$ if and only if

$$P_{\mathcal{R}}(x) \le P_{\mathcal{M}}(x)^{1 - \beta}$$

for all $x \in \{0,1\}^*$ as $|x| \to \infty$. This condition is
easily satisfied by many natural stochastic processes
for a small $\beta$.

Lemma 5. If $d(\mathcal{R}, \mathcal{M}) \to 0$, we have

$$P_{\mathcal{R}}(x) \to P_{\mathcal{M}}(x)$$

for all $x \in \{0,1\}^*$.

C. Efficiency and Uncertainty 

In this subsection, we investigate the relation between
efficiency and uncertainty. We show that given a stochastic
process $\mathcal{R}$ with uncertainty $\beta$, as described in the previous
subsection, one cannot construct a variable-length extractor
with efficiency strictly larger than $1 - \beta$ for all the possibilities
of $\mathcal{R}$.

Let us first consider a simple example: let $X$ be a random
sequence with the uniform distribution on $\{0,1\}^n$ and let $Y$
be an arbitrary random sequence on $\{0,1\}^n$ such that

$$\frac{\log_2 \frac{P[Y = x]}{P[X = x]}}{\log_2 \frac{1}{P[X = x]}} \le \beta, \quad \forall x \in \{0,1\}^n.$$

Now, we show that from the source $Y$, one cannot construct
an extractor with efficiency strictly larger than $1 - \beta$. To see
this, we consider an extractor $f$ with output length $m$, and a
source $Y$ with

$$P[Y = y] \in \{0, 2^{-n(1 - \beta)}\}, \quad \forall y \in \{0,1\}^n.$$

For this source $Y$, its entropy is $H(Y) = n(1 - \beta)$. In
order to make sure that the output sequence of $f$, denoted by $Z$,
is $\epsilon$-close to $U_m$, we must have

$$\lim_{m \to \infty} \frac{m}{n(1 - \beta)} = \lim_{m \to \infty} \frac{m}{H(Y)} \le 1.$$

So we cannot generate more than $n(1 - \beta)$ random bits
asymptotically. In this case, if we apply the seeded extractor
$f$ to the random sequence $X$, which is a possibility of $Y$, then
the efficiency is

$$\eta = \lim_{m \to \infty} \frac{m}{H(X)} = \lim_{m \to \infty} \frac{m}{n} \le 1 - \beta.$$

□

So there does not exist a seeded extractor that can extract
randomness from an arbitrary $Y$ and whose efficiency is strictly
larger than $1 - \beta$. Here, $\beta$ is the uncertainty of the source.
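
The numbers in this simple example can be checked for concrete parameters. The Python sketch below uses illustrative values $n = 10$ and $\beta = 0.3$ (assumptions of the example, chosen so that $n(1-\beta)$ is an integer) and evaluates the uncertainty ratio and the entropies of $X$ and $Y$.

```python
import math

n, beta = 10, 0.3
support_size = 2 ** round(n * (1 - beta))   # Y lives on 2^{n(1-beta)} strings
p_y = 1.0 / support_size                    # P[Y = y] on its support
p_x = 2.0 ** (-n)                           # X is uniform on {0,1}^n

# Ratio from the uncertainty condition, evaluated on the support of Y.
d_value = math.log2(p_y / p_x) / math.log2(1.0 / p_x)

h_y = support_size * p_y * math.log2(1.0 / p_y)   # entropy of Y
h_x = float(n)                                    # entropy of X

print(d_value, h_y, h_y / h_x)
```

An extractor that must work for $Y$ can output at most about $H(Y) = n(1 - \beta)$ bits, so when the same extractor is fed $X$, whose entropy is $n$, its efficiency is at most the printed ratio $1 - \beta$.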

Theorem 6. Let $\mathcal{M}$ be a known stochastic process, and $\mathcal{R}$
be an arbitrary stochastic process such that $d(\mathcal{R}, \mathcal{M}) \le \beta$.
Then one cannot construct a variable-length extractor whose
efficiency is strictly larger than $1 - \beta$ for all possible $\mathcal{R}$.

Proof: Let $f$ be a variable-length extractor whose input
sequence is a random sequence $X_m$ on $S_p$ and whose output
sequence is a random sequence $Y$ on $\{0,1\}^m$. Assume that
as $m \to \infty$, $f$ can extract from an arbitrary $\mathcal{R}$ such that the
output sequence $Y$ is $\epsilon$-close to $U_m$.

Let $h = H_{\mathcal{M}}(X_m)$ be the entropy of the input sequence
based on the distribution of $\mathcal{M}$. We want to show
that there exists a process $\mathcal{R}$ such that $d(\mathcal{R}, \mathcal{M}) \le \beta$ and
$H_{\mathcal{R}}(X_m) \le h(1 - \beta)$ as $m \to \infty$.

To find such a process $\mathcal{R}$, we order all the elements in $S_p$
as $x_1, x_2, x_3, \ldots$ such that

$$P_{\mathcal{M}}(x_1) \ge P_{\mathcal{M}}(x_2) \ge P_{\mathcal{M}}(x_3) \ge \ldots$$

Then we divide all these elements into groups,

$$\{x_1, \ldots, x_{i_1}\}, \ \{x_{i_1+1}, \ldots, x_{i_2}\}, \ \{x_{i_2+1}, \ldots, x_{i_3}\}, \ \ldots$$

such that the total probability of the elements in each group
is almost the probability of its first element to the power of
$1 - \beta$, i.e.,

$$0 \le P_{\mathcal{M}}(x_{i_j+1})^{1 - \beta} - \sum_{k = i_j + 1}^{i_{j+1}} P_{\mathcal{M}}(x_k) < P_{\mathcal{M}}(x_{i_{j+1}+1}),$$

for all $j \ge 0$, where $i_0 = 0$.

Let $A = \{x_1, x_{i_1+1}, x_{i_2+1}, \ldots\}$ be the set consisting of the
first elements of all the groups. Now, we consider a possibility
of $\mathcal{R}$ in the following way: for all $x \in A$,
its probability is

$$P_{\mathcal{R}}(x) = \sum_{k = i_j + 1}^{i_{j+1}} P_{\mathcal{M}}(x_k), \quad \text{if } x = x_{i_j+1};$$

for all $x \in S_p \setminus A$, its probability
is

$$P_{\mathcal{R}}(x) = 0.$$

For this source $\mathcal{R}$, the entropy of the input sequence is

$$H_{\mathcal{R}}(X_m) = \sum_{x \in A} P_{\mathcal{R}}(x) \log_2 \frac{1}{P_{\mathcal{R}}(x)}.$$

As $m \to \infty$, we have

$$H_{\mathcal{R}}(X_m) = \sum_{x \in A} P_{\mathcal{R}}(x) \log_2 \frac{1}{P_{\mathcal{R}}(x)}$$

$$\to (1 - \beta) \sum_{j \ge 0} \sum_{k = i_j + 1}^{i_{j+1}} P_{\mathcal{M}}(x_k) \log_2 \frac{1}{P_{\mathcal{M}}(x_{i_j+1})}$$

$$\le (1 - \beta) \sum_{j \ge 0} \sum_{k = i_j + 1}^{i_{j+1}} P_{\mathcal{M}}(x_k) \log_2 \frac{1}{P_{\mathcal{M}}(x_k)}$$

$$= (1 - \beta) H_{\mathcal{M}}(X_m).$$

According to Lemma 2, as $m \to \infty$,

$$\frac{H_{\mathcal{R}}(Y)}{m} \to 1.$$

Furthermore, we can get

$$\lim_{m \to \infty} \frac{H_{\mathcal{R}}(Y)}{H_{\mathcal{R}}(X_m)} \le 1,$$

which implies that

$$\lim_{m \to \infty} \frac{m}{(1 - \beta) h} \le 1;$$

otherwise, the output sequence cannot be $\epsilon$-close to the
uniform distribution $U_m$.

If we apply the extractor $f$ to the source $\mathcal{M}$, which is also
a possibility for $\mathcal{R}$, then its efficiency is

$$\eta = \lim_{m \to \infty} \frac{m}{h} \le 1 - \beta.$$

So it is impossible to construct a variable-length extractor
with efficiency strictly larger than $1 - \beta$ for all the possibilities
of the source $\mathcal{R}$. This completes the proof. ■

With the same proof, we can also get the following theorem.

Theorem 7. Let $\mathcal{R}$ be an arbitrary stochastic process such
that $d(\mathcal{R}, \mathcal{M}) \le \beta$ for a stationary ergodic process $\mathcal{M}$ with
unknown distribution. Then one cannot construct a
variable-length extractor whose efficiency is strictly larger than $1 - \beta$
for all possible $\mathcal{R}$.

The above theorems show that one cannot construct an
extractor whose efficiency is strictly larger than $1 - \beta$ for
all possible sources $\mathcal{R}$. Here, $\beta$ is an important parameter
that measures the uncertainty of a real source $\mathcal{R}$, relative either
to a known process or to the nearest stationary ergodic process. In
the next few sections, we present a few constructions for
efficiently extracting randomness from the sources described
in this section. We show that their efficiency $\eta$ satisfies

$$1 - \beta \le \eta \le 1.$$

That means the bound $1 - \beta$ is actually achievable, and
the constructions proposed in this paper are asymptotically
optimal in efficiency.

IV. Construction I: Approximated by Known 
Processes 

In this section, we consider those sources which can be 
approximated by a known stochastic process $\mathcal{M}$, namely, an 
arbitrary process $\mathcal{R}$ with $d(\mathcal{R}, \mathcal{M}) \le \beta$ for a known process 
$\mathcal{M}$. We say that a stochastic process $\mathcal{M}$ is known if its 
distribution is given, i.e., $P_{\mathcal{M}}(x)$ can be easily calculated for 
any $x \in \{0, 1\}^*$. Note that this process $\mathcal{M}$ is not necessarily 
stationary or ergodic. For instance, $\mathcal{M}$ can be an 
independent process $z_1 z_2 \ldots \in \{0, 1\}^*$ such that 

$$\forall i \ge 1, \quad P_{\mathcal{M}}(z_i = 1) = \frac{1 + \sin(i/10)}{2}.$$

A. Construction 

Our goal is to extract randomness from an imperfect random 
source $\mathcal{R}$. The problem is that we do not know the exact 
distribution of $\mathcal{R}$, but we know that it can be approximated 
by a known process $\mathcal{M}$. So we can use the distribution of 
$\mathcal{M}$ to estimate the distribution of $\mathcal{R}$. As a result, we have the 
following procedure to extract $m$ almost-random bits. 

The idea of the procedure is to first produce a random 
sequence of length $n$ and min-entropy $k = m(1 + \alpha)$ with 
$\alpha > 0$, from which we can further obtain a sequence $\epsilon$-close 
to the uniform distribution $U_m$ by applying a $(k, \epsilon)$ seeded 
extractor. According to the results on seeded extractors, this 
constant $\alpha > 0$ can be arbitrarily small. 

Construction 1. Assume the real source $\mathcal{R}$ is an arbitrary 
stochastic process such that $d(\mathcal{R}, \mathcal{M}) \le \beta$ for a known 
process $\mathcal{M}$. Then we extract $m$ almost-random bits from $\mathcal{R}$ 
based on the following procedure. 

1) Read input bits one by one from $\mathcal{R}$ until we get an input 
sequence $x \in \{0, 1\}^*$ such that 

$$\log_2 \frac{1}{P_{\mathcal{M}}(x)} \ge \frac{k}{1 - \beta_p},$$

where $\beta_p = \beta + \epsilon_p$ with $\epsilon_p > 0$ and $k = m(1 + \alpha)$ with 
$\alpha > 0$. The small constant $\epsilon_p$ has a value depending on 
the input set $S_p$; as $m \to \infty$, $\epsilon_p \to 0$. The constant $\alpha$ 
can be arbitrarily small. 

2) Let $n$ be the maximum length of all the possible input 
sequences, then 

$$n = \arg\min \left\{ \ell \in \mathbb{N} \;\middle|\; \forall y \in \{0, 1\}^{\ell},\ \log_2 \frac{1}{P_{\mathcal{M}}(y)} \ge \frac{k}{1 - \beta_p} \right\}.$$

If $|x| < n$, we extend the length of $x$ to $n$ by adding 
$n - |x|$ trivial zeros at the end. Since $x$ is randomly 
generated, from the above procedure we get a random 
sequence $Z$ of length $n$. It can be proved that this 
random sequence has min-entropy $k$. 

3) Applying a $(k, \epsilon)$ extractor to $Z$ yields a binary sequence 
of length $m$ that is $\epsilon$-close to the uniform distribution 
$U_m$. □ 

The following example compares this 
construction with fixed-length constructions. 

Example 4. Let $\mathcal{M}$ be a biased coin with probability 0.8 (of 
being 1). If $\frac{k}{1 - \beta_p} = 2$, then we get the input set 

$S_p = \{0, 10, 110, 1110, 11110, 111110, 1111110, 1111111\}$. 

In this case, the expected input length is strictly smaller than 
7. For fixed-length constructions, to get a random sequence 
with min-entropy at least 2, we have to read 7 input bits 
regardless of the context, which is less efficient than the 
variable-length method. □ 
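
The input set in Example 4 can be reproduced with a short sketch. It assumes the model is a biased coin and uses the stopping rule of step 1 with the right-hand side $k/(1 - \beta_p)$ passed in as a single number; the names `input_set`, `p_one` and `threshold` are illustrative and do not come from the paper.

```python
from math import log2

def input_set(p_one, threshold):
    """Enumerate the shortest strings x with log2(1/P_M(x)) >= threshold
    for a biased coin whose probability of producing 1 is p_one."""
    done = {}                      # finished input sequences -> P_M(x)
    frontier = [("", 1.0)]         # prefixes that have not yet met the threshold
    while frontier:
        x, prob = frontier.pop()
        if x and log2(1.0 / prob) >= threshold:
            done[x] = prob         # stopping rule fires: x belongs to S_p
        else:
            frontier.append((x + "0", prob * (1.0 - p_one)))
            frontier.append((x + "1", prob * p_one))
    return done

S = input_set(0.8, 2.0)
# Expected number of bits read when the source really is the model coin.
expected_len = sum(len(x) * pr for x, pr in S.items())
```

Under the model distribution the expected input length here is roughly 3.95 bits, compared with the 7 bits that a fixed-length scheme must always read.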

Theorem 8. Construction 1 generates a random sequence of 
length $m$ that is $\epsilon$-close to $U_m$. 

Proof: We only need to prove that, given a source $\mathcal{R}$ and a 
model $\mathcal{M}$ with $d_p(\mathcal{R}, \mathcal{M}) \le \beta_p$, it yields a random sequence 
$Z$ with min-entropy at least $k$. 

According to the definition of $d_p(\mathcal{R}, \mathcal{M})$, for all $x \in S_p$, 

$$\frac{\log_2 \frac{P_{\mathcal{R}}(x)}{P_{\mathcal{M}}(x)}}{\log_2 \frac{1}{P_{\mathcal{M}}(x)}} \le \beta_p.$$

Based on the construction, for all $x \in S_p$, 

$$\log_2 \frac{1}{P_{\mathcal{M}}(x)} \ge \frac{k}{1 - \beta_p}.$$

The two inequalities above yield that 

$$\log_2 \frac{1}{P_{\mathcal{R}}(x)} \ge k$$

for all $x \in S_p$. 

The second step, i.e., adding trivial zeros, does not 
reduce the min-entropy. As a result, we get a random 
sequence $Z$ of length $n$ with min-entropy at least $k$. 

Since k = m(l + a) with a > 0, according to Lemma 
Q] we can construct a seeded extractor that applies to the 
sequence Z and generates a binary sequence e-close to the 
uniform distribution U m . 

This completes the proof. ■ 



B. Efficiency Analysis 

Now, we study the efficiency of Construction 1. According 
to our definition, given a construction, its efficiency is 

$$\eta = \lim_{m \to \infty} \frac{m}{H_{\mathcal{R}}(X_m)}.$$

Theorem 9. Given a real source $\mathcal{R}$ and a known process $\mathcal{M}$ 
such that $d(\mathcal{R}, \mathcal{M}) \le \beta$, the efficiency of Construction 1 
is 

$$1 - \beta \le \eta \le 1.$$

Proof: Since $\eta$ is always upper bounded by 1, we only 
need to show that $\eta \ge 1 - \beta$. 

According to Lemma Q] as m — > 00, we have 

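$$\lim_{m \to \infty} \frac{k}{m} = 1.$$

Now, let us consider the number of elements in $S_p$, i.e., 
$|S_p|$. To calculate $|S_p|$, we let 

$$S'_p = \{x[1 : |x| - 1] \mid x \in S_p\},$$

where $x[1 : |x| - 1]$ is the prefix of $x$ of length $|x| - 1$. Then 
for all $y \in S'_p$, 

$$\log_2 \frac{1}{P_{\mathcal{M}}(y)} < \frac{k}{1 - \beta_p}.$$

Hence, 

$$\log_2 |S'_p| \le \frac{k}{1 - \beta_p}.$$

It is easy to see that $|S_p| \le 2|S'_p|$, so 

$$\log_2 |S_p| \le \frac{k}{1 - \beta_p} + 1.$$

Let $X_m$ be the input sequence, then 

$$\lim_{k \to \infty} \frac{H_{\mathcal{R}}(X_m)}{k} \le \lim_{k \to \infty} \frac{\log_2 |S_p|}{k} \le \lim_{k \to \infty} \frac{1}{1 - \beta_p} = \frac{1}{1 - \beta}.$$

Finally, it yields 

$$\eta = \lim_{m \to \infty} \frac{m}{H_{\mathcal{R}}(X_m)} \ge 1 - \beta.$$

This completes the proof. ■ 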

We see that the efficiency of the above construction is 
between $1 - \beta$ and 1. As shown in Theorem 6, the gap $\beta$, 
introduced by the uncertainty of the real source $\mathcal{R}$, cannot 
be smaller. Our construction is asymptotically optimal in the 
sense that we cannot find a variable-length extractor with 
efficiency strictly larger than $1 - \beta$. 

Corollary 10. Given a real source $\mathcal{R}$ and a known process 
$\mathcal{M}$ such that $d(\mathcal{R}, \mathcal{M}) \le \beta$, then as $\beta \to 0$, the efficiency of 
Construction 1 is 

$$\eta \to 1.$$

In this case, the efficiency of the construction can achieve 
Shannon's limit. 

If $\mathcal{R}$ is a stationary ergodic process, we can also get the 
following result. 

Corollary 11. Given a stationary ergodic process $\mathcal{R}$ and a 
known process $\mathcal{M}$ such that $d(\mathcal{R}, \mathcal{M}) \le \beta$, for the expected 
input length of Construction 1 we have 

$$\frac{1}{h(\mathcal{R})} \le \lim_{m \to \infty} \frac{E[|X_m|]}{m} \le \frac{1}{(1 - \beta) h(\mathcal{R})},$$

where $h(\mathcal{R})$ is the entropy rate of the source $\mathcal{R}$. 

Proof: This conclusion is immediate following Lemma H] 
and Theorem [9] ■ 



V. Construction II: Approximately Biased Coins 

In this section, we use a general ideal model such as a 
biased coin or a Markov chain to approximate the real source 
$\mathcal{R}$. Here, we do not care about the specific parameters of 
the ideal model. The reason is that, in some cases, the source $\mathcal{R}$ 
is very close to an ideal source but we cannot (or do not 
want to) estimate the parameters accurately. As a result, we 
introduce a construction that exploits the characteristics of biased 
coins or Markov chains. For simplicity, we only discuss the 
case in which the ideal model is a biased coin; the same idea 
can be generalized to the case in which the ideal model is a Markov chain. 
Specifically, let $G_{b.c.}$ denote the set consisting of all the models 
of biased coins with different probabilities, and consider 
$\mathcal{R}$ as an arbitrary stochastic process such that 

$$\min_{\mathcal{M} \in G_{b.c.}} d(\mathcal{R}, \mathcal{M}) \le \beta.$$

A. Construction 

The idea of the construction is similar to that of Construction 1, 
i.e., we first produce a random sequence of length $n$ and with 
min-entropy $k = m(1 + \alpha)$ for $\alpha > 0$, from which we can 
further obtain a sequence $\epsilon$-close to the uniform distribution 
$U_m$ by applying a $(k, \epsilon)$ seeded extractor. 

Construction 2. Assume the real source $\mathcal{R}$ is an arbitrary 
stochastic process such that 

$$\min_{\mathcal{M} \in G_{b.c.}} d(\mathcal{R}, \mathcal{M}) \le \beta$$

for a constant $\beta$. Then we extract $m$ almost-random bits from 
$\mathcal{R}$ based on the following procedure. 

1) Read input bits one by one from $\mathcal{R}$ until we get an input 
sequence $x \in \{0, 1\}^*$ such that 

$$\log_2 \binom{k_0 + k_1}{\max(1, \min(k_0, k_1))} \ge \frac{k}{1 - \beta_p},$$

where $k_0$ is the number of zeros in $x$, $k_1$ is the number 
of ones in $x$, $\beta_p = \beta + \epsilon_p$ with $\epsilon_p > 0$ and $k = m(1 + \alpha)$ 
with $\alpha > 0$. The small constant $\epsilon_p$ has a value depending 
on the input set $S_p$; as $m \to \infty$, $\epsilon_p \to 0$. The constant 
$\alpha$ can be arbitrarily small. 

2) Since the input sequence $x$ can be very long, we map it 
into a sequence $z$ of fixed length $n$ such that 

$$z = [I_{(k_0 > k_1)}, \min(k_0, k_1), r(x)],$$

where $I_{(k_0 > k_1)} = 1$ if and only if $k_0 > k_1$, and $r(x)$ 
is the rank of $x$ among all the permutations of $x$ with 
respect to the lexicographic order. Since $x$ is randomly 
generated, the above procedure leads us to a random 
sequence $Z$ of length $n$. 

3) Applying a $(k, \epsilon)$ extractor to $Z$ yields a random 
sequence of length $m$ that is $\epsilon$-close to $U_m$. □ 
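
Steps 1 and 2 of Construction 2 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the bits are plain Python integers, the right-hand side $k/(1 - \beta_p)$ is passed as `threshold`, the rank is 0-based, and the three fields of $z$ are returned as a tuple instead of being packed into $n$ bits. All function names are hypothetical.

```python
from math import comb, log2

def read_until_stop(bits, threshold):
    """Consume bits until log2 C(k0+k1, max(1, min(k0, k1))) >= threshold."""
    x, counts = [], [0, 0]          # counts[0] = k0 (zeros), counts[1] = k1 (ones)
    for b in bits:
        x.append(b)
        counts[b] += 1
        if log2(comb(counts[0] + counts[1], max(1, min(counts)))) >= threshold:
            return x
    raise ValueError("source exhausted before the stopping rule fired")

def lex_rank(x):
    """0-based rank of x among all arrangements of its bits, lexicographic order."""
    rank, ones_left = 0, sum(x)
    for i, b in enumerate(x):
        remaining = len(x) - i - 1
        if b == 1:
            # Arrangements with a 0 at this position come first; they must
            # place all ones_left ones in the remaining positions.
            rank += comb(remaining, ones_left)
            ones_left -= 1
    return rank

def encode(x):
    """Map x to the triple (I(k0 > k1), min(k0, k1), r(x)) of step 2."""
    k1 = sum(x)
    k0 = len(x) - k1
    return (1 if k0 > k1 else 0, min(k0, k1), lex_rank(x))

x = read_until_stop([1, 1, 0, 1, 0, 0, 1], threshold=3)
z = encode(x)
```

The mapping is injective because the indicator and the minimum count together fix $(k_0, k_1)$ once the length is known from the stopping rule, and the rank then identifies $x$ among sequences with those counts.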

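To see that the construction above works, we need to show 
that the random sequence $Z$ obtained after the second step has 
min-entropy at least $k$, and that its length $n$ is well bounded. 

Lemma 12. Given a source $\mathcal{R}$ with $\min_{\mathcal{M} \in G_{b.c.}} d(\mathcal{R}, \mathcal{M}) \le \beta$, 
Construction 2 yields a random sequence $Z$ with length 

$$n \le 1 + \left\lceil \log_2 \left( \frac{k}{1 - \beta_p} + 1 \right) \right\rceil + \left\lceil \frac{2k}{1 - \beta_p} \right\rceil.$$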

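Proof: 1) $I_{(k_0 > k_1)}$ can be represented as 1 bit. 

2) Without loss of generality, we assume $k_0 \le k_1$. According 
to our construction, 

$$\log_2 \binom{k_0 + k_1 - 1}{k_0 - 1} < \frac{k}{1 - \beta_p} \quad \text{for } k_0 > 1,$$

and 

$$\log_2 \binom{k_0 + k_1 - 1}{1} < \frac{k}{1 - \beta_p} \quad \text{for } k_0 = 0 \text{ or } k_0 = 1.$$

Then 

$$k_0 - 1 \le \log_2 \binom{2k_0 - 1}{k_0 - 1} \le \log_2 \binom{k_0 + k_1 - 1}{k_0 - 1} < \frac{k}{1 - \beta_p}.$$

So $\min(k_0, k_1)$ can be represented as $\lceil \log_2 ( \frac{k}{1 - \beta_p} + 1 ) \rceil$ bits. 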
3) Let us consider the number of permutations of x, denoted 
by N(x). If fc > 1, then 



log 2 N(x) 



< 



log 



< log 2 



fco + fci 
Ji fco 

fco + fci — 1 

ko-1 



log; 



fco + fci 



1-pp 



log 2 



fco + fci 
fco 



If fco = 1, then 



If fc = 0, then 



log 2 iV(x) =0. 
Based on the analysis above, we can get 

$$\log_2 N(x) \le \frac{2k}{1 - \beta_p}.$$

Hence, $r(x)$ can be represented as $\lceil \frac{2k}{1 - \beta_p} \rceil$ bits. 

This completes the proof. ■ 

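Let $1^a$ denote the all-one vector of length $a$ and $0^a$ the all-zero vector of length $a$; then we get 
the following result. 

Theorem 13. Construction 2 generates a random sequence of 
length $m$ that is $\epsilon$-close to $U_m$ if $P_{\mathcal{R}}(1^a) \le 2^{-k}$ and $P_{\mathcal{R}}(0^a) \le 2^{-k}$ 
for $a = 2^{\lfloor k/(1 - \beta_p) \rfloor}$. 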

Proof: Since the mapping in the second step is injective, 
it does not affect the min-entropy; we only need to prove that 
the input sequence has min-entropy $k$, i.e., 

$$\log_2 \frac{1}{P_{\mathcal{R}}(x)} \ge k, \quad \forall x \in S_p,$$

where $S_p$ is the set consisting of all the possible input 
sequences. 

It is not hard to see that if $\min(k_0, k_1) \ge 1$, 

$$P_{\mathcal{M}}(x) \le \frac{1}{\binom{k_0 + k_1}{k_0}},$$

which leads to 

$$\log_2 \frac{1}{P_{\mathcal{M}}(x)} \ge \frac{k}{1 - \beta_p}.$$

Furthermore, based on the definition of $d_p(\mathcal{R}, \mathcal{M})$, we 
get that if $\min(k_0, k_1) \ge 1$, 

$$\log_2 \frac{1}{P_{\mathcal{R}}(x)} \ge k.$$

If $\min(k_0, k_1) = 0$, the condition in the theorem 
gives the same result. 

Since k = m(l + a) with a > 0, according to Lemma 
[U we can construct a seeded extractor that applies to the 
sequence Z and generates a binary sequence e-close to the 
uniform distribution U m . 

This completes the proof. ■ 

Actually, the idea above can be easily generalized if $\mathcal{M}$ 
is a Markov chain that best approximates the real source $\mathcal{R}$. 
The idea follows the main lemma in [29], which shows how to 
generate random bits with optimal efficiency from an arbitrary 
Markov chain. 

B. Efficiency Analysis 

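For the efficiency of the construction, we can get the same 
bounds as for Construction 1. 

Theorem 14. Given an arbitrary source $\mathcal{R}$ such that 

$$\min_{\mathcal{M} \in G_{b.c.}} d(\mathcal{R}, \mathcal{M}) \le \beta,$$

if there exists a model $\mathcal{M} \in G_{b.c.}$ with probability $p \le \frac{1}{2}$ of 
being 1 or 0 and 

$$p > \sqrt{d(\mathcal{R}, \mathcal{M}) \log_2 \frac{1}{p} \cdot \frac{\ln 2}{2}},$$

then the efficiency of Construction 2 is 

$$1 - \beta \le \eta \le 1.$$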

Proof: Let $N_{k_0,k_1}$ denote the number of input sequences 
with $k_0$ zeros and $k_1$ ones in $S_p$, and let $p_{k_0,k_1}$ be the 
probability based on $\mathcal{R}$ of generating such a sequence. Let 
us define 

$$A = \{(k_0, k_1) \mid N_{k_0,k_1} > 0\},$$

then we can get 

$$H_{\mathcal{R}}(X_m) \le H(\{p_{k_0,k_1} \mid (k_0, k_1) \in A\}) + \sum_{(k_0,k_1) \in A} p_{k_0,k_1} \log_2 N_{k_0,k_1}.$$

According to the proof in the above theorem, $\min(k_0, k_1) \le \frac{k}{1 - \beta_p} + 1$. 
So there are in total at most $2(\frac{k}{1 - \beta_p} + 1)$ available 
pairs of $(k_0, k_1)$. Hence 

$$H(\{p_{k_0,k_1} \mid (k_0, k_1) \in A\}) \le \log_2 \left( 2 \left( \frac{k}{1 - \beta_p} + 1 \right) \right) = o(k).$$

Now, we write n — ko + k±. According to our method, if 

min(fcoi k±) >1, 



k + ki 
min(/co, k\ 

ko + ki — 1 
min(fco, fci) — 1 



> 2 1 -' 9 3» 



< 2 1 -"f 



Hence, given n, we get an upper bound for min(fco,fci), 
which is 

(n — l\ i 
i i j<2 1 -^}. (3) 

Note that if (lZ_\) > 2~^ , then t n is a nondecreasing 
function of n. Using the Stirling bounds on factorials yields 



lim -log 2 I " I = H(p), 

n->oc n \pn J 

where H is the binary entropy function. Hence, following (O, 
we can get 



k 



lim H(-) = lim , . 

n— >oo n n— >oc y \ — p p )n 



(4) 



Let P n denote the probability of having an input sequence 
of length at least n based on the distribution of 1Z. In this 
case, P n is a nonincreasing function of n. Let Q n denote the 
probability of having an input sequence of length at least n 
based on the distribution of M. e Qb.c, whose probability is 
p < 4. Since for all binary sequence x E {0, 1}™, 

lQ g2 E \ \ ^ nl °S2 - 5 
Pm{x) P 



we can get 



lo g2 5 7 \ < dnlog 2 -, 
Pm{x) p 



where d = d p (lZ, M). 
Since $P_n = \sum_{x \in S} P_{\mathcal{R}}(x)$ and $Q_n = \sum_{x \in S} P_{\mathcal{M}}(x)$ for 
some $S \subseteq \{0, 1\}^n$, it is not hard to prove that 

$$\log_2 \frac{P_n}{Q_n} \le d n \log_2 \frac{1}{p}.$$

According to Hoeffding's inequality, we can get 



(5) 



Q n < 2P[ki < tn] 

< 2P[- - P <- - p) 

n n 



Hence 

P n < 2~ dnl ° S2P Q n < 2e~ 1 ° S2pln2 ' dn ~ 2n(j3 ~'^") 2 (6) 
From this inequality, we see that P n — > as n — >• if 

-dlog 2 pln2-2(p- ^) 2 < 0. (7) 

Based on <j4j and (0, we can get that P n — > as n — >• if 

n I 
- > 



(l-/3 p )7J(p-^log 2 iY) 



Now, let a 



write 



(i-/3 t ,)H(p- x /diog2 lip) 



with e > 0, we can 



H n {X m ) < o(k)+ Pk M io &2 N k M 

ko,ki:ko-\-ki>ak 

+ PkoM lo &2 N k M- 
kg.ki : fco + fci <afc 

According to our analysis, if fco + fci > ak, as — > oo, 

-P« = X! Pfc °> fc i 



and log 2 Nk 0t k! < 2 1 _^ . If fco + fci < ak, then 



log 2 iV feDlfel < 



fc 



log 2 



fco + fci 



< 



1 - /9p ° 2 min(fc , fci) 1 - /3 p 
As a result, we can get 



o(fc). 



9fc fc 

H n (X m ) < o(A) + o(l)- - + (- 5- + o(*)) 



< 



1-jSp 



So 



lim 



o{k). 



>1-j9. 



Furthermore, based on the fact that $\lim_{m \to \infty} \frac{k}{m} = 1$, we 
get $\eta \ge 1 - \beta$. It is known that $\eta \le 1$, which concludes the 
theorem. ■ 

Similar to Construction 1, this construction is also 
asymptotically optimal in the sense that we cannot find a 
variable-length extractor with efficiency strictly larger than $1 - \beta$, as 
shown in Theorem 6. 

Corollary 15. Given an arbitrary source $\mathcal{R}$ such that 

$$\min_{\mathcal{M} \in G_{b.c.}} d(\mathcal{R}, \mathcal{M}) \le \beta,$$

then as $\beta \to 0$, the efficiency of Construction 2 is 

$$\eta \to 1.$$

It is easy to see that as $\beta \to 0$, Construction 2 reaches 
Shannon's limit on efficiency. If $\mathcal{R}$ is a stationary ergodic 
process, we can also get the following corollary. 

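Corollary 16. Given an arbitrary stationary ergodic source 
$\mathcal{R}$ such that 

$$\min_{\mathcal{M} \in G_{b.c.}} d(\mathcal{R}, \mathcal{M}) \le \beta,$$

if there exists a model $\mathcal{M} \in G_{b.c.}$ with probability $p \le \frac{1}{2}$ of 
being 1 or 0 and 

$$p > \sqrt{d(\mathcal{R}, \mathcal{M}) \log_2 \frac{1}{p} \cdot \frac{\ln 2}{2}},$$

then for the expected input length of Construction 2 we have 

$$\frac{1}{h(\mathcal{R})} \le \lim_{m \to \infty} \frac{E[|X_m|]}{m} \le \frac{1}{(1 - \beta) h(\mathcal{R})},$$

where $h(\mathcal{R})$ is the entropy rate of $\mathcal{R}$. 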

VI. Construction III: Approximately Stationary 
Ergodic Processes 

In this section, we consider imperfect sources that are 
approximately stationary and ergodic. Here, we let $\mathcal{R}$ be an 
arbitrary stochastic process such that $d(\mathcal{R}, \mathcal{M}) \le \beta$ for a 
stationary ergodic process $\mathcal{M}$. For these sources, universal 
data compression can be used to 'purify' input sequences, i.e., 
to shorten their lengths while maintaining their entropies. In 
[27], Visweswariah, Kulkarni and Verdú showed that optimal 
variable-length source codes asymptotically achieve optimal 
variable-length random bits generation in the sense of normalized 
divergence. Although their work only focused on ideal 
stationary ergodic processes and generates 'weaker' random 
bits, it motivates us to combine universal compression with 
fixed-length extractors for efficiently generating random bits 
from noisy stochastic processes. In this section, we first 
introduce the Lempel-Ziv code and then present its application to 
constructing variable-length extractors. 

A. Construction 

Lempel-Ziv code is a universal data compression scheme 
introduced by Ziv and Lempel PP . which is simple to 
implement and can achieve the asymptotically optimal rate for 
stationary ergodic sources. The idea of Lempel-Ziv code is to 
parse the source sequence into strings that have not appeared 
so far, as demonstrated by the following example. 

Example 5. Assume the input is 010111001110000..., then 
we parse it as strings 

0, 1, 01, 11, 00, 111, 000, ... 

where each string is the shortest string that has never appeared 
before. That means all its prefixes have occurred earlier. 

Let $c(n)$ be the number of strings obtained by parsing a 
sequence of length $n$. For each string, we describe its location 
with $\log c(n)$ bits. Given a string of length $l$, it can be described 
by (1) the location of its prefix of length $l - 1$, and (2) its last 
bit. Hence, the code for the above sequence is 

(000, 0), (000, 1), (001, 1), (010, 1), (001, 0), (100, 1), (101, 0), ... 

where the first number in each pair indicates the prefix 
location and the second number is the last bit of the string. □ 

Typically, the Lempel-Ziv code is applied to an input sequence of 
fixed length. Here, we are interested in a Lempel-Ziv code with 
fixed output length and variable input length, so that 
we can apply a single fixed-length extractor to the output of 
the Lempel-Ziv code for extracting randomness. In our algorithm, 
we read raw bits one by one from an imperfect source until the 
output of the Lempel-Ziv code reaches a certain 
length. In other words, the number of strings after parsing 
is a predetermined number $c$. For example, if the source is 
1011010100010... and $c = 4$, then after reading 6 bits, we can 
parse them into 1, 0, 11, 01. Now, we get an output sequence 
(000, 1), (000, 0), (001, 1), (010, 1), which can be used as the 
input of a fixed-length extractor. We call this code 
a variable-length Lempel-Ziv code. 
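
The variable-length parsing just described can be sketched in a few lines. This is an illustrative sketch only: the source is given as a string of '0'/'1' characters, location 0 stands for the empty prefix, and the field width for the location (3 bits in the examples) is passed explicitly as `width`; the function name is hypothetical.

```python
def variable_length_lz(bits, c, width):
    """Read bits until c new phrases have been parsed.

    Returns the list of (prefix location, last bit) pairs and the number
    of input bits consumed."""
    index = {"": 0}                # phrase -> location; 0 is the empty prefix
    pairs, phrase, used = [], "", 0
    for b in bits:
        used += 1
        if phrase + b in index:    # still a phrase seen before: keep extending
            phrase += b
            continue
        # phrase + b is new: emit (location of its prefix, its last bit)
        pairs.append((format(index[phrase], f"0{width}b"), b))
        index[phrase + b] = len(index)
        phrase = ""
        if len(pairs) == c:
            return pairs, used
    raise ValueError("source exhausted before c phrases were parsed")

pairs, used = variable_length_lz("1011010100010", c=4, width=3)
```

With the source from the paragraph above, the call stops after 6 input bits and returns the four pairs listed there; the number of bits read varies with the source, while the output length is fixed by `c` and `width`.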

Let $Z$ be a random sequence obtained from the variable-length 
Lempel-Ziv code such that its length is 

$$|Z| = (\log c + 1) c,$$

for a predetermined $c$. Then $Z$ is very close to truly random 
bits in terms of min-entropy if the source $\mathcal{R}$ is stationary 
ergodic. As a result, we have the following construction for 
variable-length extractors. 

Construction 3. Assume the real source is $\mathcal{R}$ and there exists 
a stationary ergodic process $\mathcal{M}$ such that $d(\mathcal{R}, \mathcal{M}) \le \beta$. 
Then we extract $m$ almost-random bits from $\mathcal{R}$ based on the 
following procedure. 

1) Read input bits one by one based on the variable-length 
Lempel-Ziv code until we get an output sequence $Z$ 
whose length reaches 

$$n = \frac{k}{1 - \beta_p} (1 + \epsilon),$$

where $\epsilon > 0$ is a small constant indicating the performance 
gap between the finite-length case and 
the infinite-length case for the Lempel-Ziv code; as $m \to \infty$, we 
have $\epsilon \to 0$. As above, $\beta_p = \beta + \epsilon_p$ with $\epsilon_p > 0$ 
and $k = m(1 + \alpha)$ with $\alpha > 0$. The small constant $\epsilon_p$ 
has a value depending on the input set $S_p$; as $m \to \infty$, 
$\epsilon_p \to 0$. The constant $\alpha$ can be arbitrarily small. Then 
we get a random sequence $Z$ of length $n$ and with 
min-entropy $k$. 

2) Applying a $(k, \epsilon)$ extractor to $Z$ yields a random 
sequence of length $m$ that is $\epsilon$-close to $U_m$. □ 

We show that the min-entropy of $Z$ is at least $k$ as $m \to \infty$. 
If $m$ is not very large, we can adjust the parameter $\epsilon$ so that 
the min-entropy of $Z$ is at least $k$. We can then 
apply an efficient fixed-length extractor to 'purify' 
the resulting sequence. Finally, we get $m$ random bits that 
satisfy our quality requirements in the sense of statistical 
distance. 



Theorem 17. When $m \to \infty$, Construction 3 generates a 
random sequence of length $m$ that is $\epsilon$-close to $U_m$. 

Proof: Let $x$ be an input sequence. According to Theorem 
12.10.1 in [5], for the stationary ergodic process $\mathcal{M}$, we can 
get 

$$\frac{1}{|x|} \log_2 \frac{1}{P_{\mathcal{M}}(x)} \ge \frac{c}{|x|} \log_2 c - \frac{c}{|x|} H(U, V),$$

where 

$$\frac{c}{|x|} H(U, V) \to 0 \quad \text{as } |x| \to \infty.$$

As a result, if $k = O(n)$, 

$$\lim_{k \to \infty} \frac{1}{k} \log_2 \frac{1}{P_{\mathcal{R}}(x)} \ge \lim_{k \to \infty} (1 - \beta_p) \frac{1}{k} \log_2 \frac{1}{P_{\mathcal{M}}(x)} \ge \lim_{k \to \infty} \frac{(1 - \beta_p) c \log_2 c}{k} = \lim_{k \to \infty} \frac{(1 - \beta_p) n}{k} = \lim_{k \to \infty} (1 + \epsilon) = 1.$$

Finally, we can get that 

$$\lim_{k \to \infty} \frac{H_{\min}(Z)}{k} = \lim_{k \to \infty} \frac{H_{\min}(X_m)}{k} \ge 1.$$

This implies that as $m \to \infty$, i.e., $k \to \infty$, the min-entropy 
of $Z$ is at least $k$. 

Since $k = m(1 + \alpha)$ for an $\alpha > 0$, we can continue to apply 
a $(k, \epsilon)$ extractor to extract $m$ almost-random bits from $Z$. ■ 



B. Efficiency Analysis 

Now, we study the efficiency of the construction based on 
variable-length Lempel-Ziv codes. 

Theorem 18. Given a real source $\mathcal{R}$ such that there exists a 
stationary ergodic process $\mathcal{M}$ with $d(\mathcal{R}, \mathcal{M}) \le \beta$, the 
efficiency of Construction 3 is 

$$1 - \beta \le \eta \le 1.$$

Proof: As above, we only need to prove that $\eta \ge 1 - \beta$. 

Since there are at most $2^n = 2^{c(\log_2 c + 1)}$ distinct input 
sequences, their entropy satisfies 

$$H_{\mathcal{R}}(X_m) \le c(\log_2 c + 1) = n.$$

According to the proof of Theorem 17, the 
random sequence $Z$ has min-entropy at least $k$, and it satisfies 

$$\lim_{m \to \infty} \frac{n}{k} = \frac{1}{1 - \beta}.$$

Based on the construction of seeded extractors, we can also 
get 

$$\lim_{m \to \infty} \frac{m}{k} = 1.$$

As a result, 

$$\eta = \lim_{m \to \infty} \frac{m}{H_{\mathcal{R}}(X_m)} \ge 1 - \beta.$$

This completes the proof. ■ 

Although Construction 3 has the same asymptotic efficiency as the other 
constructions, when $m$ is not large it is less efficient than 
the other constructions, because the Lempel-Ziv code does not 
always perform well when the input sequence is 
short. Its advantage is that it can handle more general 
sources without accurate estimates. In the above theorem, 
the gap $\beta$ represents how far the source $\mathcal{R}$ is from being 
stationary ergodic. In general, the efficiency loss introduced 
by the uncertainty of sources cannot be avoided. 

Corollary 19. Given a real source $\mathcal{R}$ such that there exists 
a stationary ergodic model $\mathcal{M}$ with $d(\mathcal{R}, \mathcal{M}) \le \beta$, then as 
$\beta \to 0$, the efficiency of Construction 3 is 

$$\eta \to 1.$$

This shows that as $\beta \to 0$, Construction 3 reaches 
Shannon's limit on efficiency. 

Corollary 20. Given a stationary ergodic source $\mathcal{R}$ (assume 
we do not know that it is stationary ergodic), for the expected 
input length of Construction 3 we have 

$$\frac{1}{h(\mathcal{R})} \le \lim_{m \to \infty} \frac{E[|X_m|]}{m} \le \frac{1}{(1 - \beta) h(\mathcal{R})},$$

where $h(\mathcal{R})$ is the entropy rate of $\mathcal{R}$. 

VII. Seedless Constructions 

To simulate seeded constructions of variable-length extractors 
in randomized applications, we have to enumerate all 
possible assignments of the seed, which increases the computational 
complexity significantly. In real applications, 
seedless constructions are therefore preferred over seeded 
constructions. This motivates us to study seedless constructions of 
variable-length extractors in this section. 

A. An Independent Source 

Let us first consider a simple independent source described 
in the introduction. This type of source has been widely 
studied in seedless constructions of fixed-length extractors. 

Example 6. Let $x_1 x_2 \ldots \in \{0, 1\}^*$ be an independent sequence 
generated from a source $\mathcal{R}$ such that 

$$P[x_i = 1] \in [0.9, 0.91], \quad \forall i \ge 1.$$

We see that the existing methods for generating random bits 
from ideal sources (like biased coins or Markov chains) cannot 
be applied here, since the probability of each bit is slightly 
unpredictable. Some seedless extractors have been developed 
for extracting randomness from such sources. In particular, 
there exist seedless extractors that are able to extract as 
many as $H_{\min}(X)$ random bits from an independent random 
sequence $X$ asymptotically. In order to extract $m$ random bits 
in the above example, such an extractor needs to read $\frac{m}{\log_2 \frac{1}{0.91}}$ input bits as 
$m \to \infty$. In this case, the entropy of the input sequence is in 

$$\left[ \frac{m H(0.91)}{\log_2 \frac{1}{0.91}}, \frac{m H(0.9)}{\log_2 \frac{1}{0.91}} \right],$$

where $H$ is the binary entropy function. From this, we get the efficiency of an optimal fixed-length 
extractor, which is 

$$\eta_{fixed} \in [0.2901, 0.3117],$$

i.e., only about 0.3 of the input entropy is used for generating 
random bits, which is far from optimal. 

In the above example, we let $\mathcal{M}$ be a biased coin model 
with probability $p = 0.9072$. In this case, 

$$\beta \le d(\mathcal{R}, \mathcal{M}) = 0.0315.$$

According to the constructions in the previous sections, there 
exist seeded variable-length extractors whose efficiencies 
are 

$$\eta_{variable} \in [1 - \beta, 1] \subseteq [0.9685, 1],$$

which are near Shannon's limit. 
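
The figures quoted in this example can be checked numerically. The sketch below is hedged in one respect: the per-bit expression used for the distance is an assumption about how $d(\mathcal{R}, \mathcal{M})$ is defined earlier in the paper (the largest value of $\log_2 \frac{P_{\mathcal{R}}}{P_{\mathcal{M}}} / \log_2 \frac{1}{P_{\mathcal{M}}}$ over a single bit), chosen because it is consistent with the value 0.0315 up to the rounding of $p$. The function names are illustrative.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy H(p) in bits."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

# Min-entropy per input bit in the worst case P[x_i = 1] = 0.91.
min_entropy_rate = log2(1 / 0.91)

# Efficiency of an optimal fixed-length extractor: min-entropy over entropy.
eta_fixed_low = min_entropy_rate / binary_entropy(0.90)
eta_fixed_high = min_entropy_rate / binary_entropy(0.91)

def distance_bound(p_model, p_lo, p_hi):
    """Assumed per-bit form of d(R, M) for P[x_i = 1] in [p_lo, p_hi]."""
    one = log2(p_hi / p_model) / log2(1 / p_model)
    zero = log2((1 - p_lo) / (1 - p_model)) / log2(1 / (1 - p_model))
    return max(one, zero)

d = distance_bound(0.9072, 0.90, 0.91)
eta_variable_low = 1 - d
```

The two ratios land on the interval given for $\eta_{fixed}$, and $1 - d$ is close to 0.9685, which illustrates the gap between fixed-length and variable-length extraction for this source.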

Because the source is independent, we 
can eliminate the requirement of truly random bits as the 
seed and obtain seedless variable-length extractors. To 
construct a seedless variable-length extractor, we first apply 
a seedless fixed-length extractor $E_1$ (which may not be very 
efficient) to extract a random sequence of length $d$ from input 
bits. Using this random sequence as the seed, we then 
apply a seeded variable-length extractor $E_2$ to extract $m$ 
almost-random bits from further input bits. So seedless variable-length 
extractors can be constructed as cascades of seedless 
fixed-length extractors and seeded variable-length extractors. 
Since the input length of $E_1$ is much shorter than (and negligible 
compared to) the input length of $E_2$, the efficiency of the resulting 
seedless extractor, i.e., $E = E_2 \otimes E_1$, is dominated by the 
efficiency of $E_2$. So the efficiency of the seedless extractor $E$ 
is in $[0.9685, 1]$, which is very close to optimal. 

This example demonstrates a simple construction of seedless 
variable-length extractors for independent sources, and 
it shows the significant performance gain of variable-length 
extractors compared to fixed-length extractors. 

B. Generalized Sources 

Here we consider a generalization of independent processes. 
Given a system, we use $\lambda_i$ to denote the complete system status at 
time $i$. For example, in a system that generates thermal noise, 
the system status can include the value of the noise signal, 
the temperature, the environmental effects, etc. Usually, the 
evolution of such a system has a Markov property, namely, 

$$P[\lambda_{i+1}, \lambda_{i+2}, \ldots \mid \lambda_i, \lambda_{i-1}, \ldots, \lambda_1] = P[\lambda_{i+1}, \lambda_{i+2}, \ldots \mid \lambda_i]$$

for all $i \ge 1$. Let $X = x_1 x_2 \ldots \in \{0, 1\}^n$ be the binary 
sequence generated from this system, then for any $1 \le k \le n - 1$, 

$$P[X_1^k, X_{k+1}^n \mid \lambda_k] = P[X_1^k \mid \lambda_k] \, P[X_{k+1}^n \mid \lambda_k], \quad (8)$$

where $X_a^b = x_a x_{a+1} \ldots x_b$. In some sense, the source $X$ that 
we consider is a hidden Markov process, but the number of 
hidden states can be infinite ($\lambda_i$ can be discrete or continuous). 

Example 7. One example of the above sources is the one 
studied in [12], called a space $s$ source. A space $s$ source is 
basically a source generated by a width $2^s$ branching program. 
At each step, the state of the process generating the source is 
in one of $2^s$ states, and the bit generated is a function of 
the current state. Unlike perfect Markov chains, the transition 
probabilities can be different at each step. In this example, the 
system status $\lambda_i$ is the content of space $s$ at time $i$, that is, 
one of the $2^s$ states, and $x_i \in \{0, 1\}$ is a function of $\lambda_i$. 

Space $s$ sources are very general in that most other classes of 
sources that have been considered previously can be computed 
with a small amount of space [12]. The model that we 
consider, as described by (8), is a natural generalization of 
space $s$ sources. This model has a very nice feature: from 
such a source, we can get a group of sequences conditionally 
independent of each other. Namely, given system statuses at 
some time points 

$$[\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(\gamma)}] = [\lambda_a, \lambda_{2a}, \ldots, \lambda_{\gamma a}],$$

the subsequences 

$$[X^{(1)}, X^{(2)}, \ldots, X^{(\gamma)}, X^{(\gamma+1)}] = [X_1^{a-1}, X_{a+1}^{2a-1}, \ldots, X_{(\gamma-1)a+1}^{\gamma a - 1}, X_{\gamma a+1}^{\infty}]$$

are conditionally independent of each other. Based on this 
condition, we have the following seedless construction of 
variable-length extractors. 

Construction 4. Given a source $\mathcal{R}$ described by (8), we 
can construct a seedless variable-length extractor $E$ in the 
following way: 

1) Suppose that 

$$H_{\min}(X^{(i)} \mid \lambda^{(i)}, \lambda^{(i+1)}) \ge k_d, \quad \forall\, 0 < i \le \gamma.$$

We construct a $\gamma$-source extractor [19] $E_1 : [\{0, 1\}^{a-1}]^{\gamma} \to \{0, 1\}^d$ 
such that if each source has min-entropy 
$k_d$, it can extract $d$ almost-random bits which 
are $\epsilon_1$-close to the uniform distribution on $\{0, 1\}^d$. 

2) We construct a seeded variable-length extractor $E_2 : S_p \times \{0, 1\}^d \to \{0, 1\}^m$ 
such that, conditioned on $\lambda^{(\gamma)}$, 
it can extract $m$ almost-random bits from $X^{(\gamma+1)}$, and 
these $m$ almost-random bits are $\epsilon_2$-close to the uniform 
distribution on $\{0, 1\}^m$ if the seed is truly random. 

3) The seedless variable-length extractor $E$ is a cascade 
of $E_1$ and $E_2$. Let 

$$D = E_1(X^{(1)}, X^{(2)}, \ldots, X^{(\gamma)}),$$

then we apply $D$ as the seed of $E_2$ to generate $m$ 
almost-random bits from $X^{(\gamma+1)}$; that is, 

$$E(X) = E_2(X^{(\gamma+1)}, E_1(X^{(1)}, X^{(2)}, \ldots, X^{(\gamma)})).$$
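
The data flow of the cascade can be sketched as below. The sketch only shows how the sequence is cut into blocks and how the two stages are chained; `e1` and `e2` are hypothetical callables standing in for the multi-source extractor and the seeded variable-length extractor, neither of which is implemented here. Positions follow the 1-based indexing of the text, so every $a$-th symbol between consecutive blocks is skipped.

```python
def split_blocks(x, a, gamma):
    """Return ([X^(1), ..., X^(gamma)], X^(gamma+1)).

    Block i covers 1-based positions (i-1)*a+1 .. i*a-1, so it has
    length a-1; the tail starts at 1-based position gamma*a+1."""
    blocks = [x[(i - 1) * a : i * a - 1] for i in range(1, gamma + 1)]
    tail = x[gamma * a :]
    return blocks, tail

def seedless_extract(x, a, gamma, e1, e2):
    """Cascade: seed D = e1(blocks), output = e2(tail, D).

    e1 and e2 are placeholders supplied by the caller."""
    blocks, tail = split_blocks(x, a, gamma)
    seed = e1(blocks)
    return e2(tail, seed)

blocks, tail = split_blocks("abcdefghijkl", a=3, gamma=2)
result = seedless_extract("abcdefghijkl", 3, 2,
                          e1="".join, e2=lambda t, s: (t, s))
```

In a real instantiation `e2` would read only as many symbols of the tail as its stopping rule requires, which is what makes the overall extractor variable-length.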

For this construction, we have the following theorems. 

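Theorem 21. In Construction 4, the $m$ almost-random bits 
generated by the seedless variable-length extractor $E$ are 
$(\epsilon_1 + \epsilon_2)$-close to the uniform distribution on $\{0, 1\}^m$. 

Proof: According to the construction, we can let the 
parameter $a = |X^{(i)}| + 1$ with $1 \le i \le \gamma$ be large enough, so that 
given $\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(\gamma)}$, 

$$\|D - U_d\| \le \epsilon_1.$$

Let $X_m$ be the input sequence of $E_2$ that is read from $X^{(\gamma+1)}$, 
then given $\lambda^{(\gamma)}$, we have 

$$\|E_2(X_m, U_d) - U_m\| \le \epsilon_2.$$

From the two inequalities above, given $\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(\gamma)}$, 
we have 

$$\|E_2(X_m, D) - U_m\| \le \epsilon_1 + \epsilon_2.$$

Since this is true for any assignment of $\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(\gamma)}$, 
we can get 

$$\|E_2(X_m, D) - U_m\| \le \sum_{\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(\gamma)}} P[\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(\gamma)}] (\epsilon_1 + \epsilon_2) \le \epsilon_1 + \epsilon_2.$$

Hence, the $m$ almost-random bits extracted by $E$ are also 
$(\epsilon_1 + \epsilon_2)$-close to $U_m$. ■ 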

In the following theorem, we show that the seedless 
variable-length extractor $E$ has the same efficiency as the seeded 
variable-length extractor $E_2$. 

Theorem 22. In Construction 4, suppose that 

$$H_{\min}(X^{(i)} \mid \lambda^{(i)}, \lambda^{(i+1)}) = \Theta(|X^{(i)}|), \quad \forall\, 0 < i \le \gamma.$$

Let $\eta_E$ denote the efficiency of the resulting seedless variable-length 
extractor $E$, and let $\eta_{E_2}$ denote the efficiency of 
$E_2$, then 

$$\eta_E = \eta_{E_2}.$$

Proof: According to the construction of $E_1$, we can get 
that 

$$d = \Theta(a),$$

where $a = |X^{(i)}| + 1$ for $1 \le i \le \gamma$. 
If $\epsilon_2$ is a constant, then 

$$d = O(\log m) = o(m).$$

As a result, 

$$\lim_{m \to \infty} \frac{a \gamma}{m} = 0.$$

Let $H$ denote the entropy of the input sequence of $E_2$, then 
$\eta_{E_2} = \lim_{m \to \infty} \frac{m}{H}$, and 

$$\lim_{m \to \infty} \frac{m}{a \gamma + H} \le \eta_E \le \lim_{m \to \infty} \frac{m}{H}.$$

Hence, $\eta_E = \eta_{E_2}$. ■ 

The theorem above shows that the efficiency of seedless 
variable-length extractors can be very close to optimal. For 
many sources, such as biased coins with noise or Markov 
chains with noise, the existing algorithms for ideal sources 
(e.g., perfect biased coins or perfect Markov chains) cannot 
generate high-quality random bits. At the same 
time, the traditional approaches of fixed-length extractors are 
not very efficient: the gap between their efficiency and the 
optimum is determined by the bias of the source. Seedless 
variable-length extractors combine the advantages of both; as 
a result, they can approach the information-theoretic upper 
bound on efficiency while being capable of combating noise in 
the sources. 
VIII. Conclusion and Discussion 

In this paper, we introduced the concept of variable-length 
extractors, namely, extractors with variable input 
length and fixed output length. Variable-length extractors 
generalize the existing algorithms for ideal sources 
so that they can handle general stochastic processes. They also 
improve on traditional fixed-length extractors by closing the efficiency gap 
between the min-entropy and the entropy of the source. 
The key idea of constructing variable-length extractors is to 
approximate the source using a simple model, which is a 
known process, a biased coin, or a stationary ergodic process. 
Depending on the model selected, we proposed and analyzed 
three seeded constructions of variable-length extractors. Their 
efficiency is lower bounded by $1 - \beta$ and upper bounded by 
1 (optimality), where $\beta$ ($0 \le \beta \le 1$) indicates the uncertainty 
of the real source. We also showed that our constructions are 
asymptotically optimal, in the sense that one cannot find a 
construction whose efficiency is always strictly larger than 
$1 - \beta$. In addition, we demonstrated how to construct seedless 
variable-length extractors by cascading seeded variable-length 
extractors with seedless fixed-length extractors. They can work 
for many (but not all) natural sources, such as those based on 
noise signals. 

There are certain connections between variable-length extractors 
and a whole family of variable-to-fixed length source 
codes n3.03,Gll,(23l,(3|,(26|,(28l. With a variable-to-fixed 
length code, an infinite sequence is parsed into variable-length 
phrases, chosen from some finite set $\mathcal{D}$ of phrases. Each 
phrase is then coded into a binary sequence of fixed length $m$. 
The set $\mathcal{D}$ of phrases is complete, i.e., every infinite sequence 
has a prefix in $\mathcal{D}$. The key to constructing a good variable-to-fixed 
length source code is to find the best set $\mathcal{D}$ that 
consists of at most $2^m$ prefix-free phrases and maximizes the 
expected phrase length. By comparison, the key to constructing 
a variable-length extractor is to find the best input set $S_p$ 
that consists of sequences, each with probability at most $2^{-k}$, 
and minimizes the expected sequence length. Although 
their goals are different, some common ideas can be used 
to construct both the phrase set $\mathcal{D}$ and the input set $S_p$. For 
example, in [28], Visweswariah et al. defined the phrase set $\mathcal{D}$ 
by $x^* \in \mathcal{D}$ if $P(x^*) < c$ and no prefix of $x^*$ satisfies this 
property. The same idea is applied in our Construction I. In 
03], [24], the phrase set $\mathcal{D}$ is determined by the number of 
ones and zeros in the phrase, as is our Construction II. In some 
sense, an optimal variable-to-fixed length code can produce a 
fixed-length binary sequence whose min-entropy is close to its 
length. However, variable-to-fixed length source codes do not 
always work well in constructing variable-length extractors, 
because (1) the design criteria are different, which may 
degrade the performance; (2) variable-to-fixed length source 
codes take both encoding and decoding into consideration and 
hence are computationally more complex than what we 
require (decoding is not necessary) for constructing variable-length 
extractors; and (3) the sources that we consider for 
variable-length extractors are unpredictable and thus more 
general than the ones considered in variable-to-fixed length 
source codes. 
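The threshold rule mentioned above (a sequence belongs to the set once its probability falls below c while no prefix does) can be sketched for the simplest case of a known biased coin. This is only an illustration under that assumption, not the paper's Construction I; the function names, the coin bias, and the threshold are hypothetical choices. With c = 2^{-k}, every sequence in the resulting set has probability below 2^{-k}.

```python
def threshold_phrase_set(p_one, c):
    """Build the set of binary strings x with P(x) < c and no prefix below c.

    Assumes an i.i.d. coin with P(1) = p_one (an illustrative model).
    Returns a dict mapping each string to its probability.
    """
    if not (0.0 < p_one < 1.0 and 0.0 < c < 1.0):
        raise ValueError("p_one and c must lie strictly between 0 and 1")
    phrases = {}
    stack = [("", 1.0)]                 # (prefix, probability of prefix)
    while stack:
        x, prob = stack.pop()
        if prob < c:
            phrases[x] = prob           # first time the threshold is crossed
        else:
            stack.append((x + "0", prob * (1.0 - p_one)))
            stack.append((x + "1", prob * p_one))
    return phrases


def is_prefix_free(words):
    """Check that no word is a proper prefix of another word."""
    words = sorted(words)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


phrases = threshold_phrase_set(p_one=0.75, c=0.3)
expected_length = sum(len(x) * q for x, q in phrases.items())
print(sorted(phrases))
print("prefix-free:", is_prefix_free(phrases))
print("total probability:", sum(phrases.values()))
print("expected length:", expected_length)
```

Because the search extends a prefix only while its probability is still at least c, the resulting set is prefix-free and complete by construction, and the expected length reflects how many source symbols are read on average before stopping.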